Playing with my new toy :)

in #steemstem, 7 years ago

Working in academia has some advantages. There are no fixed working hours, you have a lot of freedom and you can get your hands on some fancy new equipment.

But there is more... You can get your hands on some cool software as well :)
No, I'm not talking about the Uber Cool Ayasdi (for math); I'm talking about an ugly, custom-made script for plagiarism checking.

Take a deep breath

OK, I'll start with myself, then check other high-ranking people, and then continue down to the 50th most-rewarded account of the last 14 days.

The score goes from 0% for absolute plagiarism to 100% for an absolutely original post.

The results:

Me: for the vast majority of my posts devoted to biology, chemistry and math, I got 100%. My worst score ever was 94%. That particular post was about my hobby, and it was put together from multiple sources (about 5).

People I can trust: usually 100%. The worst score was 87%, but that's a false positive as well. The software simply picked up some phrases here and there.

OK, it can be trusted... Even for posts where you really can't be too original, the software still scores you at about 90%.

Let's move down the list:

Let's check another random post from the same guy: 53%.

Something about graphene, 70%: the 20th century was the age of plastics then its right to say that the 21st century is the age of graphene. from

OK... It has something here and there but... Not real plagiarism.

The same author also has one post with 48% (how on Earth was my worst score 90+?!)

Next: <60%, because it's very similar to and wow, it's short

We have some 70-75s...

Below 40% for Geoengineering; it's nothing but Wiki:

underground transmission lines, radioactive waste disposal, oil and gas pipelines, and solar thermal storage facilities, there is need for the measurement of thermal resistivity of soils and backfill materials required for its achievement to be carried out by geotechnical investigations.

underground transmission lines, oil and gas pipelines, radioactive waste disposal, and solar thermal storage facilities. A geotechnical investigation will include surface exploration and subsurface exploration of a site
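For the curious, here is a minimal sketch of how a 0-100% originality score could be computed from word-sequence overlap. It is not the script used for the scores in this post (that one isn't shown here); the function names `shingles` and `originality`, the choice of 3-word sequences, and treating the first fragment as the post and the second as the Wiki text are all assumptions made for illustration.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams ("shingles") in text, lowercased."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def originality(post, source, n=3):
    """Percentage of the post's shingles NOT found in the source.

    0.0 means every shingle is copied, 100.0 means none is.
    Hypothetical scoring rule, not the one used by the script in the post.
    """
    post_shingles = shingles(post, n)
    if not post_shingles:
        return 100.0  # nothing to compare, so nothing is flagged
    shared = post_shingles & shingles(source, n)
    return 100.0 * (1 - len(shared) / len(post_shingles))


# The two fragments quoted above (assumed: first = post, second = Wiki).
post = (
    "underground transmission lines, radioactive waste disposal, oil and gas "
    "pipelines, and solar thermal storage facilities, there is need for the "
    "measurement of thermal resistivity of soils and backfill materials "
    "required for its achievement to be carried out by geotechnical "
    "investigations."
)
source = (
    "underground transmission lines, oil and gas pipelines, radioactive waste "
    "disposal, and solar thermal storage facilities. A geotechnical "
    "investigation will include surface exploration and subsurface "
    "exploration of a site"
)

print(f"{originality(post, source):.0f}% original")
```

Run on these two short fragments alone, the number will not match the score reported for the whole post, since the real script compares the full text against many sources and its method is unknown. The sketch only shows the idea: shuffled list items still share most of their word sequences with the source.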

In Conclusion

My language community has had about 30 people rewarded so far. I personally blacklisted 3 or 4 and cleaned up my yard.
Today I can proudly say that all active members are trustworthy people whose lowest level of education is an M.Sc. Most of us either hold a Ph.D. or are Ph.D. students. They can't plagiarize because they are the ones actually producing the original sources.

National curators, please clean up your yards. It's a very simple thing to do. If you want to do it.

If we keep paying for Wiki re-wording this is going nowhere because we will chase out all the smart people.

I really don't care if someone is poor or whatever with Steemit as the only source of income.
In that case - show some respect!

A GOOD EU salary for a skilled professional is around 2,000 EUR a month. That's the equivalent of a 65/65 reward per day.
In the Eastern countries, that salary is the equivalent of a 20/20 reward.

It's insulting to put up a [ Copy - Wiki/ How Stuff Work - Paste - Change a few words ] post and expect to earn the daily salary of an expert. Those experts need to work 8 hours for that money.

Now I'm going to take a walk, see you tomorrow with blacklist updates.

It's time for <3 :* <3 :* <3 :* <3 :* <3 :* <3 :* <3 :* <3 :* <3 :* <3 :*

I can do it myself, but, have some fun and do something good for this community. It deserves the best and no less than that.

:* 4 U as well


If we keep paying for Wiki re-wording this is going nowhere because we will chase out all the smart people.

Couldn't agree more.

I really don't care if someone is poor or whatever with Steemit as the only source of income.
In that case - show some respect!

That means the person should try particularly hard to make a good quality post, and not vice versa.

It's insulting to put up a [ Copy - Wiki/ How Stuff Work - Paste - Change a few words ] post and expect to earn the daily salary of an expert. Those experts need to work 8 hours for that money.

Couldn't have said it better myself.

They took'ur jobs! :)

Double upvote for: "That means the person should try particularly hard to make a good quality post, and not vice versa."

I wonder if tools like a plagiarism checker could ever be at least 90% accurate. It would really help to filter people out.

From my perspective, the biggest problem is how to give a binary Yes/No answer to something fuzzy. If the whole structure is Wiki with some minor changes, is it plagiarism, rephrasing or (poor) originality?

I'm certainly encouraging people to use such tools (more variety = more confidence) and to scan some posts from time to time.
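One way around the Yes/No problem is to not force a binary answer at all. The sketch below maps a 0-100% originality score to three labels, leaving the fuzzy middle for a human to judge. The function name `triage` and both thresholds (40 and 85) are hypothetical, loosely inspired by the scores mentioned in the post; they are not settings of any real tool.

```python
def triage(score, copied_below=40.0, original_from=85.0):
    """Map a 0-100 originality score to one of three labels instead of Yes/No.

    Both thresholds are illustrative assumptions, not calibrated values.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if score < copied_below:
        return "likely copied"
    if score >= original_from:
        return "likely original"
    return "needs a human look"


# A few scores of the kind mentioned in the post.
for s in (94, 70, 53, 38):
    print(s, triage(s))
```

The middle band is the honest part: a score of 53% or 70% says "something here and there" and nothing more, so a person still has to read the post.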

I could really use such software to improve my own writing.

Reason: I have always struggled with the demarcation line between original work and plagiarism. As a student of Greek and Latin history, I personally recognize that all our supposedly original 20th century ideas have precedents in Roman, Greek and Egyptian times, and that we just mix and improve the ideas of others, because that is how our brains treat partially remembered information. Since I am very active in engineering and know a lot about engineering history, I have seen a lot of 19th century inventions being re-patented as new, with barely any modification, and actually receiving legally protected status.

But in my original field of study, law, nothing is really 'new', since law is an old field of study and every position in a discussion has already been taken in some legal jurisdiction at some point. At least, if you shop around and read a lot in your field, you discover your unoriginality very quickly. I always found 'plagiarism' a very fishy, ill-defined concept, since experts labeling something as 'original' is usually based on a lack of knowledge regarding the history of their own field. (This was in an era before such software existed.)

So is it just software verifying whether you copy-pasted a fragment of a document without a reference? How many languages and countries does it cross-verify? Can it check whether a paragraph found elsewhere was rewritten and reused in your document (which is common practice even among 'Ph.D.-level original creators')? Or is it software that can check whether an idea is original? Proving that an idea is original, even a patented idea, is a very hard thing to do, especially with software; that's why it has to be decided arbitrarily by arbiters and judges with imperfect knowledge and 'common sense'. Then again, A.I. might one day attain the level of abstraction needed to judge this in a more objective manner.

The result of my insecurity about 'originality' is that I have been sitting on a drawer filled with 'supposedly' original ideas for years, avoiding academic publishing, believing that being perfect in my search for prior art is more important than communicating an idea and progressing in open discussion. This conflicts with my ambition of being an entrepreneur and just going for it. Blogging is of course something else.

Is the general argument valid that the hunt for plagiarism might lead to a stagnation in innovation?

To remedy such insecurity, it would be nice to have a tool I can use myself to check whether I am actually a plagiarizer or an original thinker, and whether my ideas are worthy of being published (e.g. is it sufficient to quote a source not to be a plagiarizer, or, for academic publishing, do your ideas have to be 100% 'original' in all aspects as well?). If software can help me make that distinction, at least for some aspects of writing, that would be great.

We can go into a discussion about peer review as well, but this is supposed to be a question, not an article.

So yes, where can I get such software, which one is a reference in its class and is it affordable for a private person?

For me, the only trace of human intelligence is the ability to expand on what already exists and put it in some new context: to give a perspective, a conclusion or at least an open question.

For example, my "most plagiarized" posts are those devoted to tanks.
You need to give production numbers, some features by which to recognize them and so on.
However, there is plenty of room to add something original if you really like the topic.

For example, versions A-D had a low-velocity gun. OK... How fast was it? What was its penetration, and at what distance? How did it stand up against the opposing tanks?

Upgraded versions got a different gun. Why? Did a new opponent appear? Did it get a different role at the tactical level?