Introducing TechGenius

This is a post I wrote for the Instadat blog. For more news on big data follow @Instadat on Twitter or like us on Facebook.

We’re excited to announce the beta launch of TechGenius (http://techgenius.me), our news aggregator and research tool for the tech industry that we’ve been building for the last two months. “What’s so exciting about it?”, you might ask. Read about it below.

TechGenius (http://techgenius.me) Screenshot Oct 2, 2012

News Aggregators Aplenty

We’re well aware that there are plenty of news aggregators out there, each with something unique or special. Reddit and Google News are comfortably sitting at the top, and Digg just did their big relaunch. Newsle, News.me, and Flipboard are three news aggregators that use social data to suggest articles.

We decided to create a news aggregator and research tool, specific to the tech industry, that showcased the power of big data and artificial intelligence (aka advanced algorithms).

We collect news from top Tech RSS Feeds

We asked a few of our close friends and the online communities we’re part of, and settled on collecting tech news from the following sources: TechCrunch, VentureBeat, Mashable, Engadget, Gizmodo, Forbes (Tech), Huffington Post (Tech), GigaOM, The Next Web, Ars Technica, Read Write Web, Cnet, Business Insider SAI, The Verge, All Things D, Techli, GeekWire, Fast Company (Tech), Wired (Business), and Pando Daily.

We automatically tag articles with keywords.

Currently, we have 623 keywords (important phrases, people, companies). The list was initially curated, then expanded by an auto-keyword generator that generates keywords using the Crunchbase API. We’ve collected over 27,000 articles since July 31st, and found 120,000+ occurrences of our keywords in those articles (avg. 4.4 keywords/article). The most commonly found keywords are:

1) Apple (5080)
2) iPhone (4300)
3) Facebook (4153)
4) App (4101)
5) Mobile (3582)
6) Video (3482)
7) Twitter (3470)
8) Google (3468)
9) Launch (3445)
10) Android (2320)

Followed by: Smartphone, Media, CEO, Tablet, Internet, Game, Software, iOS, iPad, Microsoft, Digital, Windows, Amazon, Startup, Raise, Blog, Website, etc.
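To make the tagging step concrete, here is a minimal sketch of how keyword matching and counting could work. It is not our production code: the short keyword list, the sample articles, and the function names tag_article and keyword_counts are all made up for illustration, and whole-word, case-insensitive matching is just one reasonable assumption.

```python
import re
from collections import Counter

# Hypothetical sample; the real list has 623 keywords.
KEYWORDS = ["Apple", "iPhone", "Facebook", "App", "Mobile"]

def tag_article(text, keywords=KEYWORDS):
    """Return the keywords found in the text as whole words, ignoring case."""
    found = []
    for kw in keywords:
        # \b keeps "App" from matching inside "Apple".
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(kw)
    return found

def keyword_counts(articles, keywords=KEYWORDS):
    """Count how many articles each keyword was tagged on."""
    counts = Counter()
    for text in articles:
        counts.update(tag_article(text, keywords))
    return counts

# Made-up article snippets for illustration only.
articles = [
    "Apple unveils a new iPhone with a faster mobile chip.",
    "Facebook updates its mobile app for Android.",
]
counts = keyword_counts(articles)
print(counts.most_common(3))
```

A ranking like the top-10 list above would then just be counts.most_common(10) over the full article set.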

We use visitor activity to identify trends and popular articles.

TechGenius tracks when a visitor reads an article, visits a keyword or author page, or shares an article on Twitter or Facebook. That data is used to identify popular authors for the week, trending keywords, and celebrated articles.
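As a rough illustration of turning tracked activity into a trend list, the sketch below sums weighted events per keyword over the past week. The event format, the weights (a share counting three times a read), and the trending function are assumptions for the example, not a description of the actual TechGenius pipeline.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical weights: sharing is treated as a stronger signal than reading.
EVENT_WEIGHTS = {"read": 1.0, "keyword_visit": 1.0, "author_visit": 1.0, "share": 3.0}

def trending(events, now, window=timedelta(days=7), top_n=3):
    """Rank targets by weighted activity inside the window that ends at `now`."""
    scores = Counter()
    for ev in events:
        if now - window <= ev["time"] <= now:
            scores[ev["target"]] += EVENT_WEIGHTS.get(ev["type"], 0.0)
    return [target for target, _ in scores.most_common(top_n)]

# Made-up events for illustration only.
now = datetime(2012, 10, 2)
events = [
    {"type": "read", "target": "iPhone", "time": datetime(2012, 10, 1)},
    {"type": "share", "target": "Android", "time": datetime(2012, 9, 30)},
    {"type": "read", "target": "iPhone", "time": datetime(2012, 9, 29)},
    {"type": "share", "target": "Digg", "time": datetime(2012, 8, 1)},  # outside the window
]
print(trending(events, now))
```

The same function could rank authors or articles by putting an author name or article ID in the target field.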

TechGenius User Profile Screenshot (Oct 2, 2012)

We use advanced algorithms to recommend articles to our users.

For people who use their Facebook account to log in to TechGenius, we use some serious techniques to generate recommendations. To begin, we generate three scores between every user and article, each of which updates over time:

1) A “U Score”, based on the combined activity of similar users on any given article. (If you and I have read and liked similar articles in the past, future articles you like will result in a high U Score for me.)

2) A “K Score”, which combines your keyword preference (a score assigned between you and each keyword), and the occurrence of keywords in the particular article.

3) An “O Score”, which is an overall popularity score that’s assigned to each article.

Over time, TechGenius learns which score suggests the best articles to each user, automatically changing the weight of the three scores based on each user’s activity.
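For the curious, here is a simplified sketch of how three scores can be blended with per-user weights that shift over time. It assumes each score is scaled to the range 0 to 1, and the update rule (reward the scores that agreed with what the user actually did, then renormalise) is a stand-in chosen for illustration; the names combined_score and update_weights and the learning rate are hypothetical.

```python
SCORE_NAMES = ("U", "K", "O")

def combined_score(scores, weights):
    """Weighted sum of the U, K, and O scores for one user/article pair."""
    return sum(weights[name] * scores[name] for name in SCORE_NAMES)

def update_weights(weights, scores, liked, rate=0.1):
    """Shift weight toward scores that agreed with the user's reaction.

    Assumes every score lies in [0, 1]. `rate` is a hypothetical learning rate.
    """
    target = 1.0 if liked else 0.0
    new = {}
    for name in SCORE_NAMES:
        # Agreement is 1 when the score matched the outcome, 0 when it was opposite.
        agreement = 1.0 - abs(target - scores[name])
        # Keep a small floor so no score is ever switched off entirely.
        new[name] = max(weights[name] + rate * (agreement - 0.5), 0.01)
    total = sum(new.values())
    return {name: w / total for name, w in new.items()}

# Made-up numbers: the user liked an article the K Score rated highly.
weights = {"U": 1 / 3, "K": 1 / 3, "O": 1 / 3}
scores = {"U": 0.2, "K": 0.9, "O": 0.5}
updated = update_weights(weights, scores, liked=True)
print(combined_score(scores, weights), updated)
```

In this example the K Score predicted the like best, so its weight grows for this user while the U Score's weight shrinks.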

How Article Scores are Calculated on TechGenius [Infographic]
Go to TechGenius.

