<rss version="2.0" xmlns:syndicaat="http://www.hitsyndicaat.com/1.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>MySyndicaat Feedbot: sachamonotti_Data and Text Mining</title><description>This is an automatic RSS 2.0 channel generated by MySyndicaat for feedbot: sachamonotti_Data and Text Mining</description><link>http://192.168.254.26/myfeed/blog/default/sachamonotti_Data and Text Mining</link><managingEditor>MySyndicaat Team</managingEditor><language>en</language><copyright>Copyright: MySyndicaat</copyright><item>
<title>Is Your City HyperLocal?</title><description><![CDATA[Seattle residents are spoiled for choice when it comes to hyperlocal blogging. In addition to a couple of networks – Next Door Media {My Ballard, PhinneyWood, Magnolia Voice, Queen Anne View, Fremont Universe} and Neighborlogs {Capitol Hill Seattle, Central District...]]></description><content:encoded><![CDATA[<p>Seattle residents are spoiled for choice when it comes to hyperlocal blogging. In addition to a couple of networks – Next Door Media {<a href="http://www.myballard.com/">My Ballard</a>, <a href="http://www.phinneywood.com/">PhinneyWood</a>, <a href="http://www.magnoliavoice.com/">Magnolia Voice</a>, <a href="http://www.queenanneview.com/">Queen Anne View</a>, <a href="http://www.fremontuniverse.com/">Fremont Universe</a>} and Neighborlogs {<a href="http://capitolhillseattle.com/">Capitol Hill Seattle</a>, <a href="http://www.centraldistrictnews.com/">Central District News</a>, <a href="http://www.thesouthlake.com/">The Southlake</a>, etc.} there are a number of independents (e.g. <a href="http://westseattleblog.com/blog/">West Seattle Blog</a>). All told, there may be up to 100 blogs which focus on residential issues in and around Seattle.</p><p>Is Seattle unusual in its coverage, or are there other places that have a thriving hyperlocal blogging culture? 
Please comment if your neighborhood is covered.</p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/cHTu9R0HNZU/is-your-city-hyperlocal.html</link><guid>http://datamining.typepad.com/data_mining/2009/07/is-your-city-hyperlocal.html</guid><author>Matthew Hurst</author><category>hyperlocal</category><pubDate>Fri, 03 Jul 2009 21:51:17 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>Interactive Dashboards: US.Gov Aces, Wimbledon Double Faults</title><description><![CDATA[<p>The US government has unveiled a dashboard to track IT spending, which looks at first sight a shining example of the power and value of business intelligence. The Wimbledon tennis dashboard, on the other hand, appears less than impressive </p>
<p><a href="http://www.nytimes.com/2009/07/01/technology/01dashboard.html?hpw" target="_blank">As reported</a> in <em>The New York Times</em>, the so-called IT Dashboard was put together by the US government in a remarkably short six weeks, and will "track and analyze the more than $70 billion a year that the federal government spends on information technology."</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/07/government_dash.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/07/government_dash.html</guid><author></author><category>business intelligence</category><pubDate>Thu, 02 Jul 2009 14:16:57 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Government Dashboard Scores a Bulls-Eye</title><description><![CDATA[<p>The US government has unveiled a dashboard to track IT spending. Apparently put together in a remarkably short six weeks, it looks at first sight a shining example of the power and value of business intelligence. But don't take my word for it... browse the dashboard yourself!</p>
<p><a href="http://www.nytimes.com/2009/07/01/technology/01dashboard.html?_r=2&hpw" target="_blank">As reported in <em>The New York Times</em></a>, the so-called IT Dashboard will "track and analyze the more than $70 billion a year that the federal government spends on information technology." Some features of the IT spending dashboard include:</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/07/government_dash.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/07/government_dash.html</guid><author></author><category>business intelligence</category><pubDate>Thu, 02 Jul 2009 13:23:56 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Naughty Feeds</title><description><![CDATA[Do you have a naughty feed? Come on, admit it. You deliberately left out the title, or did you put in an empty summary? Maybe you’re the one who doesn’t put in any dates, or perhaps you set the permalink...]]></description><content:encoded><![CDATA[<p>Do you have a naughty feed? Come on, admit it. You deliberately left out the title, or did you put in an empty summary? Maybe you’re the one who doesn’t put in any dates, or perhaps you set the permalink to the home page of your blog. Well – you are a naughty blogger, shame on you!</p>]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/BuQiIpKxp0Y/naught-feeds.html</link><guid>http://datamining.typepad.com/data_mining/2009/07/naught-feeds.html</guid><author>Matthew Hurst</author><pubDate>Thu, 02 Jul 2009 02:26:21 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>Bing haz Twitter</title><description><![CDATA[This is very cool. When Bing recognizes a search for a celeb (of the real space or other varieties), it will provide an answer composed of their Twitter identity and recent tweets. Not yet rolled out for me, but the...]]></description><content:encoded><![CDATA[<p><a href="http://www.bing.com/community/blogs/search/archive/2009/07/01/bringing-a-bit-of-twitter-to-bing.aspx">This is very cool</a>. When Bing recognizes a search for a celeb (of the real space or other varieties), it will provide an answer composed of their Twitter identity and recent tweets. Not yet rolled out for me, but the Bing blog has a screen shot (repeated here).</p>  <p><a href="http://datamining.typepad.com/.a/6a00d8341c994053ef011570a8586e970c-pi"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="143" alt="image" src="http://datamining.typepad.com/.a/6a00d8341c994053ef0115719d7ca6970b-pi" width="518" border="0" /></a></p>]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/TQKd3XP1cKU/bing-haz-twitter.html</link><guid>http://datamining.typepad.com/data_mining/2009/07/bing-haz-twitter.html</guid><author>Matthew Hurst</author><category>search</category><pubDate>Thu, 02 Jul 2009 02:23:40 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>CEP, Events, and Continuous {Transformation | Intelligence}</title><description><![CDATA[<p>Given that BI thought leaders are wrestling with the notion of events, perhaps we will see a BI-mainstreaming of event processing in the not-too-distant future. Myself, <a href="http://www.intelligententerprise.com/blog/archives/2007/09/complex_event_p.html" target="_blank">I was way ahead of the game</a> in <a href="http://www.intelligententerprise.com/showArticle.jhtml?articleID=159907832" target="_blank">my expectations of demand</a> for BI access to stream sources. While a combination of legacy database and analytical technology has held BI back, lack of perception of need has been a far greater factor, especially given the under-utilization of conventional BI decades after the term first became popular. </p>
<p>Interest in streams and events has definitely picked up in the last few months -- <a href="http://www.intelligententerprise.com/blog/archives/2008/12/bi_on_content_f.html" target="_blank">I've reported on novel applications</a> for "continuous transformation" and otherwise done <a href="http://www.coral8.com/news/pr/337.html" target="_blank">a bit of writing to promote awareness</a> -- and next year could very well be the break-out year for BI on data and event streams.</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/07/cep_events_and.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/07/cep_events_and.html</guid><author></author><category>business intelligence</category><pubDate>Wed, 01 Jul 2009 15:23:36 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Mulling the Mystery of Microsoft&apos;s BI Market Share</title><description><![CDATA[<p>How do you gauge Microsoft's business intelligence market share when it gives BI functionality away for free? That was a bit of a puzzle even before Microsoft's <a href="http://www.intelligententerprise.com/channels/performance_management/showArticle.jhtml?articleID=212902660" target="_blank">January announcement</a> that it would end development of the PerformancePoint Server, the one and only entirely BI-focused product the company had. As part of that move, Microsoft now bundles what it calls "PerformancePoint Services" -- dashboarding, scorecarding and analytic capabilities -- into the enterprise edition of SharePoint. That's bundled as in free, just as Reporting Services and Analysis Services have long been bundled with Microsoft SQL Server. Microsoft's BI front end is Excel, the general-purpose spreadsheet tool that's part of the Office suite -- seldom purchased separately or used exclusively for BI.</p>
<p>So now if you own Microsoft SQL Server, SharePoint (Enterprise) and Office, you already own Microsoft BI, and these days, Microsoft executives take every opportunity to point that out. In an interview earlier this month, Kristina Kerr, group product manager of Microsoft BI, told me, "the move we made in January has definitely spurred on a lot of growth and a lot of interest in BI among SharePoint Enterprise customers. These are tough economic times, so a lot of customers are looking internally to see what they already own and see how they can make the most of it."</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/07/mulling_the_mys.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/07/mulling_the_mys.html</guid><author></author><category>business intelligence</category><pubDate>Wed, 01 Jul 2009 13:24:05 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Twisting Terms to Make BI Market Share Claims</title><description><![CDATA[<p>I always look forward to IDC's annual BI market shares, waiting to see who comes out on top and who is losing ground. I'm sure everyone noticed that they were, in fact, a month early this year! Call me a bean counter at heart, but I like the irrefutable, quantifiable comparison they bring that other evaluations (including my own BI Scorecard) lack. Or so one would think.</p>
<p>Most product evaluations involve a degree of subjectivity, with varying definitions and opinions of what capabilities and criteria matter more. The IDC market share figures, on the other hand, are cold, hard facts: what were the revenues, who's leading, who's growing. While such data may have been somewhat trackable on 10Ks when BI vendors were independent, it's now largely impossible as BI is often but a small part of a larger company. Software vendors don't have to report their revenues by market segment, and throughout the year, we only get vague, imprecise references about how the BI business is going. The IDC report lays out the breakdown by vendor.</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/06/twisting_terms.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/06/twisting_terms.html</guid><author></author><category>business intelligence</category><pubDate>Tue, 30 Jun 2009 16:18:25 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Free, Blogs</title><description><![CDATA[The Blog Herald comments on the Free skirmish between Gladwell and Anderson by pointing out The blog is alive and kicking, if nothing else but because it is hard to pick critics and arguments to pieces in 140 characters or...]]></description><content:encoded><![CDATA[<p>The <a href="http://www.blogherald.com/2009/06/30/reviewing-the-reviewers-the-blog-is-alive-and-kicking/">Blog Herald</a> comments on the Free skirmish between <a href="http://www.newyorker.com/arts/critics/books/2009/07/06/090706crbo_books_gladwell?currentPage=all">Gladwell</a> and <a href="http://www.longtail.com/the_long_tail/2009/06/dear-malcolm-why-so-threatened.html">Anderson</a> by pointing out</p><blockquote><p>The blog is alive and kicking, if nothing else but because it is hard to pick critics and arguments to pieces in 140 characters or less.</p></blockquote><p>I <a href="http://datamining.typepad.com/data_mining/2008/01/chris-anderson.html">wrote</a> <a href="http://datamining.typepad.com/data_mining/2008/02/free-of-underst.html">quite</a> a <a href="http://datamining.typepad.com/data_mining/2007/10/is-free-good.html">bit</a> about the book when it was being formulated. Now, I’m looking for the (free) online version but haven’t found it yet. 
Chris/Wired is known for <a href="http://norvig.com/fact-check.html">erring on the side of sensationalism</a> to move units (as is appropriate in the media business), so I wonder what his investment in this new thesis is.</p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/AyZB7jvXlNs/free-blogs.html</link><guid>http://datamining.typepad.com/data_mining/2009/06/free-blogs.html</guid><author>Matthew Hurst</author><category>economics</category><pubDate>Tue, 30 Jun 2009 13:11:17 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>Bucking the Cloud Computing Hype</title><description><![CDATA[<p>Years ago I was running a large software development shop when the "offshoring" movement became all that and a bag of chips. I felt extreme pressure to fire some of my staff in the US and cut a deal with the dozens of outsourcing firms that were calling me daily. I was asked about offshoring so many times that it was clear to me that the pressure was on. </p>
<p>I had to buck the offshoring hype at the time, but for good reasons. First, our systems had issues around quality and architecture. Until we corrected those issues, bundling the code up for development offshore would do little good, and could actually kill the software. Once we stabilized the code, then I could pick portions of it to be developed in outsourced development shops, but not until then. It was very tough explaining that to laymen who just saw the dollars and cents, along with the trend and hype that told them offshoring was the way to go.<br />
</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/06/bucking_the_clo.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/06/bucking_the_clo.html</guid><author></author><category>enterprise applications</category><pubDate>Mon, 29 Jun 2009 11:23:33 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Measure, Don’t Guess – Growth in the Blogosphere</title><description><![CDATA[Charles Arthur writes a piece about the slow demise of the blogosphere. Arthur asserts that bloggers are a fading breed, and that they've all gone to Facebook, and especially Twitter. Arthur claims to have come to this conclusion via a...]]></description><content:encoded><![CDATA[<p>Charles Arthur writes a piece about <a href="http://www.guardian.co.uk/technology/2009/jun/24/charles-arthur-blogging-twitter">the slow demise of the blogosphere</a>. Arthur asserts that bloggers are a fading breed, and that</p><blockquote><p>they've all gone to Facebook, and especially Twitter.</p></blockquote><p>Arthur claims to have come to this conclusion via a mixture of anecdotal evidence, and data provided by Technorati. Let’s do our own experiment to see if the blogosphere is fading. Let’s take a very mundane search term – one that we expect to be a constant background in the sea of celebrity death buzz, hi-tech launches and liver transplants: ‘car repair’. As we can see from the Blogpulse graph, it is pretty stable with a few blips here and there:</p><p><a href="http://datamining.typepad.com/.a/6a00d8341c994053ef01157173fdb9970b-pi"><img border="0" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" alt="carrepair" title="carrepair" width="500" src="http://datamining.typepad.com/.a/6a00d8341c994053ef01157173fdc0970b-pi" height="300"></a></p><p>Blogpulse plots the percentage of all blog posts on this topic. If the blogosphere were dying, the absolute counts would also be slowly reducing (even if the percentages were staying the same as the graph shows).</p><p>On Jan 4th, 0.026% of posts were on the term ‘car repair’. This translates to 142 posts (Blogpulse allows you to click through to see the number of hits). On June 21st, when 0.027% of posts were on the term, Blogpulse registers 144 hits. 
Ok, I don’t really see any slacking off there. What happens when we look at more data points? If we do this for ‘car repair’ and ‘birthday’ we get the results below. Here I’ve normalized the values by the percentage of posts (count/percent</p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/B22r1W9SIzs/measure-dont-guess-growth-in-the-blogosphere.html</link><guid>http://datamining.typepad.com/data_mining/2009/06/measure-dont-guess-growth-in-the-blogosphere.html</guid><author>Matthew Hurst</author><pubDate>Sat, 27 Jun 2009 23:41:07 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>The Long Tail of Text Mining</title><description><![CDATA[In systems that execute inferences via a pipeline of steps, every step is an opportunity for failure. Therefore, it is imperative that implementers focus attention on the details of every step. For example, in text mining, systems have to Import...]]></description><content:encoded><![CDATA[<p>In systems that execute inferences via a pipeline of steps, every step is an opportunity for failure. Therefore, it is imperative that implementers focus attention on the details of every step. For example, in text mining, systems have to</p><ol><li>Import and parse documents – did you get the title? did you recognize the footers? did you strip out the page numbers?</li><li>Identify sentences and words – is the document in a Latin-alphabet language? are there word separations? are you dealing with acronyms? how is your unicode-fu?</li><li>Provide part of speech tags for the words – is the text an example of the type of data that the POS tagger trained on?</li><li>Identify entities – are you prepared to identify unusual names like Barack Obama?</li><li>etc.</li></ol><p>I’ve seen a couple of attention-to-detail bugs surface in the past few hours. 
The first was reported by <a href="http://searchengineland.com/google-thinks-michael-jackson-died-at-age-65-in-2007-21659">Danny Sullivan</a>, in which Google (and this is still the case at the time of writing) thinks that Michael Jackson the writer is the most salient person with that name.</p><p><a href="http://datamining.typepad.com/.a/6a00d8341c994053ef01157173cde2970b-pi"><img border="0" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" alt="image" title="image" width="474" src="http://datamining.typepad.com/.a/6a00d8341c994053ef01157173cded970b-pi" height="94"></a></p><p>The second is visible on <a href="http://www.wesmirch.com/090627/h1255">WeSmirch</a>, in which the system fails to identify the title of Lisa Marie Presley’s blog, naming it ‘Create Free Blogs & Online Journals on MySpace Blogs’:</p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/mFDC58V8dlo/the-long-tail-of-text-mining.html</link><guid>http://datamining.typepad.com/data_mining/2009/06/the-long-tail-of-text-mining.html</guid><author>Matthew Hurst</author><pubDate>Sat, 27 Jun 2009 23:10:30 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>Steve Irwin, Michael Jackson</title><description><![CDATA[When Steve Irwin, the famous crocodile hunter, was killed by a stingray, 5.5% of the posts on that day in the blogosphere (September 05, 2006) mentioned his name. Yesterday, mentions of ‘Michael Jackson’ topped out around 3%. Things are...]]></description><content:encoded><![CDATA[<p>When Steve Irwin, the famous crocodile hunter, was killed by a stingray, <a href="http://datamining.typepad.com/data_mining/2006/09/some_perspectiv.html">5.5% of the posts on that day in the blogosphere (September 05, 2006) mentioned his name</a>. Yesterday, mentions of ‘Michael Jackson’ topped out around 3%. Things are heading north from there today – currently around 8% – but note that the more immediate statistics come with a higher margin of error due to the lower sample size.</p><p><a href="http://datamining.typepad.com/.a/6a00d8341c994053ef01157162067d970b-pi"><img border="0" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" alt="mj" title="mj" width="500" src="http://datamining.typepad.com/.a/6a00d8341c994053ef011571620685970b-pi" height="300"></a></p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/x7vHjOSDy9Y/steve-irwin-michael-jackson.html</link><guid>http://datamining.typepad.com/data_mining/2009/06/steve-irwin-michael-jackson.html</guid><author>Matthew Hurst</author><pubDate>Fri, 26 Jun 2009 13:45:51 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
<item>
<title>When Business Gets Too Personal</title><description><![CDATA[<p>Visualization guru <a href="http://www.perceptualedge.com/blog/?p=491" target="_blank">Stephen Few reminds us</a> that analyst opinions, while offered by recognized experts, are inherently personal, and that on the other side of the table, there are real people behind products, marketing campaigns, and corporate decisions. I'll amplify that each of us does bring unique personal experience and even personality to bear when reviewing (analysts) or promoting (vendors) products, and I'll agree that we should each be accountable for what we write or claim.</p>
<p>It's an analyst's personal perspective, coupled with strong judgment, communications skills, and fairness, that creates a sense of authority and makes his or her views worth reading. Good analysts don't blindly accept vendor claims. We investigate, and sometimes we reject what we've been told. But I disagree with Steve that analysts should always name names. Some situations become simply too personal. I and others I know have even been the subject of vindictive behavior, which unhelpfully diverts attention from products to people. In the worst cases I've seen, the vendor can even exploit personal conflict to dismiss or attempt to denigrate the analyst. Among cases I've encountered recently, involving other analysts and myself.</p>]]></description><link>http://www.intelligententerprise.com/blog/archives/2009/06/when_business_g.html</link><guid>http://www.intelligententerprise.com/blog/archives/2009/06/when_business_g.html</guid><author></author><category>business intelligence</category><pubDate>Fri, 26 Jun 2009 10:50:48 GMT</pubDate><source url="http://www.intelligententerprise.com/blog/movabletype/index.xml&#x25;3Bjsessionid&#x25;3D2EFNX2OF5IGLAQSNDLQSKH0CJUNN2JVN">The Intelligent Enterprise Blog</source></item>
<item>
<title>Twitter Trending Terms – Could Do Better</title><description><![CDATA[Twitter has oodles of data – millions of tweets a day. They have smart people working on this data, and they make all the right noises about social search. However, I’m looking at the trending topics on the site just...]]></description><content:encoded><![CDATA[<p>Twitter has oodles of data – millions of tweets a day. They have smart people working on this data, and they make all the right noises about social search. However, I’m looking at the trending topics on the site just now and I see these: MJ’s, Rip MJ, RIP Michael Jackson, Farrah Fawcett, #iranelection, Pop, Thriller, MTV, Iran, #michaeljackson.</p><p>I don’t get it. There are a number of problems here:</p><ol><li>These aren’t topics, they are words or phrases. There are only 4 topics present here (Michael Jackson’s death, Farrah Fawcett’s death, the Iranian Election and MTV).</li><li>The ‘phrases’ present in the terms are pretty lame: ‘MJ’s’? Earlier today the phrase ‘Did Michael Jackson’ was a trending topic.</li><li>There is no attempt at normalization (RIP MJ == RIP Michael Jackson)</li><li>They actually are not at all interesting – anyone out there not know about Michael Jackson, Farrah Fawcett or Iran? Perhaps the MTV thing is a little more obscure.</li><li>They aren’t trending – ok, the RIP stories are, but the Iranian election? 
That’s been top of mind for many days now.</li></ol><p>Twitter’s featuring of these topics, and the immaturity of the technology in spite of the promise of the data, do not paint a good picture for their prospects.</p> ...]]></content:encoded><link>http://feedproxy.google.com/~r/DataMining/~3/tgeYEjd_xZ0/twitter-trending-terms-could-do-better.html</link><guid>http://datamining.typepad.com/data_mining/2009/06/twitter-trending-terms-could-do-better.html</guid><author>Matthew Hurst</author><category>twitter</category><pubDate>Fri, 26 Jun 2009 05:59:14 GMT</pubDate><source url="http://feeds.feedburner.com/DataMining">Data Mining: Text Mining, Visualization and Social Media</source></item>
</channel>
</rss>
