Hive Analytics - Improved Word Cloud!

in Hive Proposals · 2 months ago (edited)

A few days ago I announced a new Word Cloud report on Hive Analytics. The first draft of this feature filtered out a lot of common words, but still got hung up on some metadata in posts that was not being filtered out properly.

I released a new update today that greatly improves how I filter out HTML tags and also adds more common words to the filter. You should see a much better Word Cloud as a result. I still plan on making improvements to this feature, so it will keep getting better.

Before

After

You can see in this example, using @beaker007's account, that the word cloud is a lot more focused on content. Things like centerxcenter, centercenter, and classphishy no longer pollute the word cloud.
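Tokens like centercenter typically show up when markup characters are stripped without removing the tags themselves, so the tag names fuse into "words". As a rough illustration only (this is not the actual Hive Analytics code; the stop-word list, the regexes, and the `word_counts` helper are all assumptions made up for this sketch), replacing each tag with a space before counting avoids that:

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real filter would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "it", "on"}

TAG_RE = re.compile(r"<[^>]+>")   # anything that looks like an HTML tag
WORD_RE = re.compile(r"[a-z']+")  # lowercase words, apostrophes allowed

def word_counts(html_text):
    """Count words after dropping HTML tags and common words."""
    # Replace each tag with a space so the words on either side do not fuse.
    text = TAG_RE.sub(" ", html_text)
    words = WORD_RE.findall(text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

sample = '<center><div class="phishy">The splinterlands battle is on</div></center> battle'
counts = word_counts(sample)
print(counts.most_common())
```

With this sample, neither "center" nor "phishy" ends up in the counts, because the tags (including their attributes) are removed as a whole.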

You can now easily share your word cloud on PeakD snaps by pressing:



You can check out your word cloud (or anyone else's) at https://hiveanalytics.usehive.com/word-cloud.


I am currently working on some improvements for the Curation reports. I will likely consolidate the three reports into one by improving the Curation Leaderboard to include all the data.

I also want to improve the Word Cloud feature by offering multiple choices for how aggressively common words are filtered out. I want the word cloud to really show what a user is talking about without losing their personality. I do, however, see value in aggressively removing words that don't offer a lot of value.
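One way the selectable aggressiveness could look, purely as a sketch: tiered stop-word sets where each level filters everything the previous level does, plus more. The level names, the word lists, and the `filter_words` helper below are hypothetical and not part of Hive Analytics.

```python
# Hypothetical tiers; each level is a superset of the one before it.
LIGHT = {"the", "a", "an", "and", "of", "to"}
MODERATE = LIGHT | {"is", "was", "it", "this", "that", "have"}
AGGRESSIVE = MODERATE | {"really", "just", "like", "get", "got", "very"}

LEVELS = {"light": LIGHT, "moderate": MODERATE, "aggressive": AGGRESSIVE}

def filter_words(words, level="moderate"):
    """Drop words that appear in the stop-word set for the chosen level."""
    stop = LEVELS[level]
    return [w for w in words if w.lower() not in stop]

words = "I really like this game and it is just fun".split()
for level in LEVELS:
    print(level, filter_words(words, level))
```

The light level keeps filler such as "really" and "just", which preserves more of a writer's voice, while the aggressive level leaves only the words that carry the topic.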

I also want to start adding in some of the Hive Engine reports I have planned.

If you have suggestions on words you think should be removed from the Word Cloud, feel free to use the feedback feature or leave a comment.


There are over 250 unique users using Hive Analytics!



Hi @themarkymark I think it would be interesting to cross-reference the data from these two tools:

https://holoz0r.github.io/HiveReportCard/

https://hivestats.io/

What do you think? I know you like data because you said so before.

I have no objection to working with Marky, but I think our tools diverge significantly in functionality.

Mine is focused on single-user insights, whereas Marky's is much more robust, better implemented, requires a back end, and focuses on ranking users, utilising all available data. Mine is not as sophisticated, technically speaking.

Mine runs entirely in your browser (and is slower as a result) instead of having a server it talks to every time a query is made.

Edit: I was also going to (eventually) build a word cloud function, but Marky beat me to it, so I don't see the need to duplicate functionality.

Have you thought about multilingual posts already? I think every post has to be language-filtered before generating the word cloud. Many Hive posts have two or more languages in a single post.

It looks like my hyperlinks to various platforms in my post footer boilerplate are included. Is there a good way to filter out anything with

 https://*

or maybe

 www.*

to pare down this kind of result?
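A pattern along those lines can be written as a regular expression. This is only a sketch of the idea, not how Hive Analytics handles links; the `URL_RE` pattern, the `strip_urls` helper, and the sample footer are assumptions for illustration.

```python
import re

# Matches http://, https://, or bare www. links up to the next whitespace.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)

def strip_urls(text):
    """Replace every link with a space so it never reaches the word counter."""
    return URL_RE.sub(" ", text)

footer = "Find me on https://peakd.com/@example and www.example.com today"
cleaned = strip_urls(footer).split()
print(cleaned)
```

Running this before tokenising keeps domain names and URL path fragments out of the cloud while leaving the surrounding words untouched.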

When I click this: https://hiveanalytics.usehive.com/word-cloud I only get this word cloud, probably the wrong one :-).

image.png

 2 months ago  

I'm no longer working on it.

Hang on, not working on "word cloud" or the entire "Hive Analytics" thing? Both give the same error. It would be a shame if this project is not moving forward. Did I miss something?

 2 months ago  

The entire project. It didn't get funding, so I moved on to other projects.

Understood. I'm not sure I voted (I think I did), but I thought it takes a while. Sad though, I liked the project, and you did not request a lot compared to other DHF proposals.