Archive for March 2012
Designing great data products – Summary blog from #StrataConf
Posted by dakoller in data science on 29 March 2012
O’Reilly Radar – Insight, analysis, and research about emerging technologies.
via Designing great data products.
…the session with this title was one of the best sessions at this year's StrataConf.
Photos of the StrataConf 2012 buttons: data science as wallpaper or slide background
Posted by dakoller in data science on 28 March 2012
O’Reilly handed out a few very funny buttons at this year's StrataConf. Since I want to reuse them myself, I am offering photos of them in a Google+ album, free to use.
The photos are large and detailed enough to be used as wallpaper, slide backgrounds, etc.
My favorites are:
Have fun reusing them!
Startup Helps Small E-Businesses Stand Even With Amazon, Provides Pricing as a Service
Posted by dakoller in data science on 27 March 2012
Yesterday my Google Reader feed gave me a very inspiring use case for the tech cocktail of data mining, language processing & image recognition: Startup Helps Small E-Businesses Stand Even With Amazon, Provides Pricing as a Service.
This could be the next generation of the earlier API mashups: these services connect information in a much more relevant way… and the nice thing is that in many cases the business model is part of the package.
How do you identify specific content in an online email system (gmail, hotmail)?
Posted by dakoller in nlp, Semantic Web on 26 March 2012
For Googlemail you could do it like this:
0) Think of the kind of content you want to be notified about and write down terms that are likely to accompany this type of content in a text/attachment (for example, a "flight confirmation" will usually also contain fields like booking ID, departure date etc.).
1) If you need immediate user attention, you might
1a) use Google's context-sensitive gadgets ( https://developers.google.com/go… ) to identify content of the type you are interested in (you can use a regular expression to match mails/attachments), or
1b) use the Google Data API if you are comfortable handling this in a backend process ( http://code.google.com/intl/de-D… ).
2) You can forward/post the mails/attachments to your web application and notify the user that you have processed this kind of content.
In the context gadgets, processing is limited to steps you can run inside a JS script/an HTML page, so regex evaluation is the most convenient solution, though it is not very flexible (think of changing terms etc.).
When you need a learning model, you might want to use more sophisticated language processing toolkits, but they need backend processing capabilities, which usually means running a backend server (for Python, have a look at www.nltk.org ).
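The term-based matching from step 0 can be sketched for a backend process roughly as follows. This is only a minimal illustration in Python using the standard `re` module; the term list, the function names and the threshold of two hits are assumptions made for the example, not part of any Google API.

```python
import re

# Hypothetical term list for the "flight confirmation" content type (step 0).
FLIGHT_TERMS = ["flight confirmation", "booking id", "departure date"]

# One case-insensitive pattern per term; whitespace between words may vary.
PATTERNS = [
    re.compile(
        r"\b" + r"\s+".join(re.escape(word) for word in term.split()) + r"\b",
        re.IGNORECASE,
    )
    for term in FLIGHT_TERMS
]


def matched_terms(text):
    """Return the terms from FLIGHT_TERMS that occur in the given text."""
    return [term for term, pat in zip(FLIGHT_TERMS, PATTERNS) if pat.search(text)]


def looks_like_flight_confirmation(text, min_hits=2):
    """Flag a mail when at least min_hits terms occur (threshold is an assumption)."""
    return len(matched_terms(text)) >= min_hits
```

Keeping the terms in a plain list makes the inflexibility mentioned above visible: whenever the wording of such mails changes, the list has to be edited by hand.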
What is the step by step process to build an ontology for news content?
Posted by dakoller in data science, Semantic Web on 26 March 2012
If you are targeting a news content ontology, a book like the already mentioned (and very good) "Semantic Web for the Working Ontologist" ( http://www.amazon.de/Semantic-We… ) is only part of the story: another crucial part is managing the, typically team-based, process of putting the ontology together.
In this area there are not many solutions yet (especially when you don't want to train everybody on the team in the details of the Semantic Web): one notable tool is http://poolparty.biz/ , which focuses on ontology & vocabulary creation for subject matter experts without requiring them to drop down to text file editing.
If you already have a big bag of quality news content, you might also try to "fish" the relevant & specific terms out of the existing content using language processing tools and put them into your ontology. …this can help you reach a critical mass of content very fast (regarding term finding, you might want to look at the Python-based NLTK).
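To give an idea of what "fishing" for terms could look like, here is a rough sketch. It deliberately uses only the Python standard library instead of NLTK and simply ranks words by the number of documents they occur in; the stopword list, the function name and the ranking criterion are assumptions for illustration, and a real setup would likely use proper tokenization and multi-word terms.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real run would use a much fuller one.
STOPWORDS = {
    "the", "a", "an", "of", "in", "on", "and", "to", "for",
    "is", "are", "was", "with", "by", "at", "after",
}


def candidate_terms(documents, top_n=10):
    """Rank non-stopword tokens by the number of documents they appear in."""
    doc_freq = Counter()
    for doc in documents:
        # Unique lowercase word tokens of at least two characters per document.
        tokens = set(re.findall(r"[a-z][a-z\-]+", doc.lower()))
        doc_freq.update(tokens - STOPWORDS)
    return doc_freq.most_common(top_n)
```

The resulting list is only a pool of candidates; deciding which of them become concepts in the ontology remains a job for the subject matter experts.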
New content
Posted by dakoller in data science, Uncategorized on 22 March 2012
Under Services and Products you will now find my data science offering on this blog.