
Wolfram Alpha: objective anticipation + analysis, please

Sunday, May 10th, 2009

The latest information from Danny Sullivan at http://searchengineland.com/, whose insights I hold in good regard, is that Wolfram Alpha will officially go live on May 18. Please mark this date in your calendar and go to www.wolframalpha.com on that day to see where and how the engine works, and where it could be done better for your needs. The company is marketing its product as a “computational knowledge engine”.

Already there’s been a certain amount of commentary around it (including, unfortunately, some hype from certain SemWeb individuals whose credibility is questionable, since their own actions have earned them reputations for spin more than substance). However, trying to be objective about the commentary to date, I’ll reserve my assessment of Wolfram Alpha as a knowledge tool until I actually get to plug+play with it, in the same way I did with Powerset, True Knowledge, SearchMe, cuil and other supposed “Google killers” / “Next Big Thingammyjigammys”.

There’s nothing less smart and more irrational than to be swept along by vacuous spin and sheep/lemming-like behavior; ‘The Emperor’s New Clothes’ is a cautionary tale for us all and, in some instances, we may even lose our own shirts if we invest time, content or money in ventures without objectivity. We invest with good intent but, probably, insufficient anticipation that the company may be less than truthful and straightforward about what their product / service can and cannot do and how they will or will not deliver to users.

[This is why proper due diligence is SO important --- please also see the global financial crisis and the primary lessons there: due diligence, due diligence, due diligence.]

Returning to WA, so far this is what I’ve seen of the site itself: a preview request box whose CSS has been poorly coded.

Hopefully, this is an itsy-teeny-weeny aberration on the part of WA developers and the knowledge engine itself is in sync, in dynamic stacks and properly aligned (data sets, surfacing of information and relational databases).

I’ve actually seen some screenshots of WA on ReadWriteWeb and first impressions are that the preview form is a minor coding mistake.

Plus I’ve watched the video from WA’s presentation at Harvard’s Berkman Center. It’s over an hour long, so please have some coffee / hot chocolate and snacks to hand before you venture into it:

Again this looks interesting.

Any platform which provides “show + tell” (aka visual representation) of a search term is helpful to users. Nevertheless, whether it’s as innovative or disruptive as some previewers suggest remains to be seen. Anyone who’s done any form of asset allocation or portfolio modeling, as I have, will be familiar with searchable datasets like WA’s from the likes of Reuters, Bloomberg, Thomson Financial, Amadeus, Multex and most of the mgmt consultancy research providers (Datamonitor, Jupiter Media Metrix, McKinsey, META Group, Nielsen, and more) and inter-governmental agencies (SEC, FSA, Edgar Online, Office for National Statistics, etc.).

Most of this information is “locked down” and “silo source” or, at the very minimum, subject to an access/subscription fee, typically in the tens of thousands of US dollars.

Looking at RWW’s screenshots and WA’s video, it would seem WA is attempting to pool and provide this information to the masses on a democratic and free basis, in much the same way Wikipedia tries to provide information. That said, it’s widely recognized that Wikipedia faces questions about content accuracy and reputational issues, since contributors can be anonymous and unverified and the system sometimes experiences hoax postings.

Certainly, having searchable and showable information on the number of Internet users, broken down globally, on a single site and page will be welcomed by academic researchers, management consultants and equity analysts who need data points to extrapolate and project future growth trends and how these will impact Internet revenue models.

The questions here would still include:

(1.) how clean, current and reliable is the data?

(2.) where is this data derived from? What are its primary sources? What processes have been applied to clean the data? E.g., have Durbin-Watson and White’s tests been run to check any fitted models for autocorrelation and heteroskedasticity in the residuals, and have anomalies been identified / removed?

(3.) when are data points updated? Are they continuously moving targets or fixed at the end of a defined period (e.g. at midnight, end of working week on Friday, end of calendar month, end of accounting year, etc.)?
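
To make the Durbin-Watson check in question (2.) concrete, here is a minimal sketch in Python. It assumes you already have the residuals from some fitted regression; the function name and the sample numbers are mine, purely for illustration, and say nothing about how WA actually cleans its data. The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of regression residuals.

    Values near 2 suggest no first-order autocorrelation, values
    towards 0 suggest positive autocorrelation and values towards 4
    suggest negative autocorrelation.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    sum_of_squares = sum(e * e for e in residuals)
    if sum_of_squares == 0:
        raise ValueError("all residuals are zero; statistic is undefined")
    # Squared differences between each residual and the one before it.
    sum_of_squared_diffs = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals, residuals[1:])
    )
    return sum_of_squared_diffs / sum_of_squares


# Hypothetical residuals, invented for this example only.
example_residuals = [0.5, -0.3, 0.4, -0.6, 0.2, -0.1]
print(durbin_watson(example_residuals))
```

A result well away from 2 would be a prompt to ask more questions about the model behind a data series, not a verdict on the data itself.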

This may seem like over-precision but I’ve sanity-checked a lot of data over the years. As someone with a maths degree (including a 99% score in my Probability + Stats exam), who applied econometrics to model the Tiger Economies and who deals with facts and figures all the time in strategy modeling, I’m all too aware of the potential outliers and impurities that can affect those facts and figures and generate what people sometimes term, “Lies, damned lies and statistics.”

This is not to say that if a site tells us the population of the world is approaching 7 billion, as the US Census Bureau does, we should reject their extrapolations.

It’s simply about TRUST in the source, reputation of the source, credibility of the source and the probability of us relying on the source. These are core principles which apply in meatspace as much as Interspace environments, btw.

As for WA being able to calculate trigonometric integrals, well, Mathematica, which Stephen Wolfram conceived, is a recognized leader in the academic and research worlds for being computationally rigorous as far as numerate and statistical inputs and derived outputs are concerned.

Whether this will translate into the worlds of semantics, vernacular, popular culture slang, knowledge wit and discerned meaning is an unknown at this time-point.

I just want to be able to test the platform on 18 May 2009 and experience what happens and what can be learnt from it, technically and visually.

I won’t be asking the WA system for answers to the following:

(1.) Who is the President of the US?

(2.) What is Paris?

(3.) Where are King Solomon’s mines / Atlantis / Xanadu / sunken gold-bearing galleon?

(4.) When will the end of the world be nigh?

(5.) Why do we fall in love with some people and not others?

(6.) How many air molecules are in a telephone box?

[This is a question they ask in Oxbridge entrance exams, for natural science courses, btw. You’re supposed to know your Brownian motion and apply the gas laws to derive the answer.]
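
For the curious, the gas-law estimate behind question (6.) can be sketched in a few lines of Python using the ideal gas law in the form N = PV / (kT). The box dimensions, the pressure and the temperature below are my own assumptions (a box of roughly 0.9 m x 0.9 m x 2.4 m at standard atmospheric pressure and about 20 degrees C), and the sketch ignores the space taken up by the phone and the caller.

```python
BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def molecules_in_box(width_m, depth_m, height_m,
                     pressure_pa=101_325.0, temperature_k=293.15):
    """Estimate the number of gas molecules in a box via N = PV / (kT).

    The default pressure and temperature are assumed everyday
    conditions, not measured values.
    """
    volume_m3 = width_m * depth_m * height_m
    return pressure_pa * volume_m3 / (BOLTZMANN_K * temperature_k)


# Assumed telephone box dimensions in metres (illustrative only).
estimate = molecules_in_box(0.9, 0.9, 2.4)
print(f"{estimate:.1e} molecules")
```

Under those assumptions the answer comes out on the order of 10^25 molecules; the examiners care about the reasoning rather than the exact figure.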

Instead, I will be asking some questions of a probabilistic, interconnecting and predictive nature to test to what extent WA is sufficiently intelligent to plot holistic decision paths — or, at least, provide related data points with relevant velocity.

Re. cellular automata, which underpin some of Mathematica, I’ve already commented on them in my ‘The Global Brain’ knol.

So in conclusion... let’s please watch out for 18 May 2009 when we get to test WA for ourselves, objectively.

SOME INTERESTING NEWS LINKS:

http://news.cnet.com/8301-17939_109-10233763-2.html

http://www.wired.com/epicenter/2009/05/how-the-wolfram-alpha-search-engine-could-save-google/

http://www.independent.co.uk/life-style/gadgets-and-tech/news/an-invention-that-could-change-the-internet-for-ever-1678109.html