Although this is a somewhat philosophical, epistemological post, 360-2020 is DEFINITELY AIMED AT THE COMMERCIAL MARKET rather than being an exercise in theory. In the big bank I was designated a “revenue generator”, which is something that happens to less than 1% of employees. We’re the ones who have direct impact on balance sheets, P+L and cashflows; in my case because of the Strategic Investments portfolio. Other people who get designated “revenue generator” are the rainmakers and dealflow corporate financiers who are often mentioned in the media. I am definitely someone who believes in and delivers on for-profit models, although I also believe there’s a way to direct more reciprocity of reward towards the user community and wider society than current for-profit capitalistic models allow.
Now I think it’s helpful for commercially-oriented entrepreneurs and innovators to have a depth and width of context and intelligence about why they’re doing what they are, how their experiences inform their approach and route to market, and also it simply means that they’re not building their ventures in myopic isolation but with an awareness of multiple dynamics. This is why I’m sharing in this post.
David Price, the brilliant co-founder of debategraph (which is a wiki for open collaboration and does decision beading on a superior quality basis), recently flagged to me the definition of information given by Gregory Bateson, the Cambridge-educated cyberneticist and semiotician: “a difference which makes a difference”. Thanks to David I’m now newly introduced to Bateson’s attempts to oppose scientists who tried to ‘reduce’ everything to mere matter. His life’s work was dedicated to re-introducing ‘Mind’ back into the scientific equations as postulated in his seminal books Steps to an Ecology of Mind and Mind & Nature. He also wrote Number is Different from Quantity.
Completely separately from any prior exposure to Bateson I’ve found all sorts of issues involved with the way we classify, reduce and treat information in the social sciences, in the natural sciences and importantly……in the machines which govern our online lives. This has arisen from:
* an increasing awareness that a wide range of online tools are non-optimal at best and primitive at worst (5 star ratings being a case in point);
* reading Professor Vedral of Oxford University’s definition of information as “Once you have a probability that something might happen, then you can define information. And it’s the same information in physics, in thermodynamics, in economics,” (more on this later); and
* my personal realization about how human subjectivity (desires, motivations, perceptions, tastes, morals, emotions, wit, etc.) has been removed from the great economic equations of Smith, Nash, Black-Scholes, Mundell-Fleming and more.
What happens is that in the text assumptions, there are some qualifier references such as how “humans are rational, there is access to equal and perfect information and the market is stochastic / asymmetric.” However, in the functional equations themselves the humanistic moral-emotional dimensions of our nature are simply not captured or explicated. This is because to-date natural and social scientists alike have not been able to quantify what are qualitative elements (human subjectivity and irrationality, over time). So our machines — which are run on deterministic and probabilistic models — can only provide at most a 50% insight into who we are, what motivates and drives us, what we like to buy, how we imagine and aspire our lives, our relationships and our societies to work and be, etc.
This is part of what I define as the “Probability Paradox of Non-Quantifiable Value.” Gregory Bateson wrote:
Numbers are the product of counting. Quantities are the product of measurement. This means that numbers can conceivably be accurate because there is a discontinuity between each integer and the next. Between two and three there is a jump. In the case of quantity there is no such jump, and because jump is missing in the world of quantity it is impossible for any quantity to be exact. You can have exactly three tomatoes. You can never have exactly three gallons of water. Always quantity is approximate.
I’d go further and say that frequency, which underpins most search algorithms, is a product of counting the number of times a link or data object is referenced across the Web and that quantity as a product of measurement is insufficient. After all, is it possible on social networks to size up or measure our respect / affection / tolerance / humor / taste preferences etc. for either another person we’re connected to or for the content that streams in real-time in our interest feeds?
No, it is not. This is because there are vital elements missing from the information that’s being processed by the machines in the first place.
TWAIN’S DEFINITION OF INFORMATION
Information is a consciousness of quantity and quality that enables differentiation and contextualization over time.
WE ARE ON A QUANTUM ADVENTURE WITH 360-2020
This post provides some insights on my life journey towards 360-2020, the way my knowhow and skills have been shaped by my Chinese heritage and Western experiences and why I decided to tackle the 5-fold challenges of:
(1.) replacing the 5-star rating system which is known to skew and be loaded at the extremes of 1 and 5;
(2.) coding a missing sequence of the Semantic Web stack;
(3.) inventing some recommendation algorithms that are neither exclusively deterministic nor probabilistic dependent;
(4.) dealing with the issue of asymmetric and imperfect information which is making our economic models less optimal and efficient than they should be (and which are the source of the global financial crisis); and
(5.) shooting for Mars and trying to be a quantum bridge between Babbage’s difference engine and passing Turing’s test for machine intelligence.
Isn’t this overly ambitious? No, not if pragmatism and experiential insights mean that you can identify that the main underlying issues are persistent and overlap, that we need to target the crux of the problem rather than the peripherals, and that we need a whole new set of solution tools. It’s like this: if someone hadn’t invented the scales, we would never have been able to measure weight with the ruler, and mass (m) in Newton’s laws of motion and Einstein’s E=mc² would not exist. We would also be unable to dimensionalize objects and discover something called density as in BMI.
So the existing online tools for measurement and quantification could be likened to being rulers but not scales or spectroscopes or NMR (nuclear magnetic resonance). People are prospecting for digital gold armed with tools which are not the equivalent of heat & motion sensors or material detectors.
Okay, so we’re going to approach digital prospecting differently……….360-2020 is designed as a one-stop, super-smart differentiation solution (quant and qualitative) to solve some important issues that reflect the current limitations of the Web and indeed the way we treat information in machines and economic models. There will be machine learning and the application of code (ActionScript, AJAX, JSON, SVGML, PHP, Perl, Seagull, Ruby on Rails, Smalltalk, natural language, semantics, games and more) configured like never before.
It’s never been more obvious to me that we seriously need to re-think what information is going into our algorithms and how that information is being processed than when I read the interview with Professor Vedral, a quantum physicist from Oxford University, who believes: “When you strip out all the unnecessary baggage, at the core is the concept of probability………Once you have a probability that something might happen, then you can define information. And it’s the same information in physics, in thermodynamics, in economics.”
His interview can be read here:
* http://www.guardian.co.uk/science/2010/mar/07/vlatko-vedral-interview-aleks-krotoski
It made me think that what’s happening in our bank risk management systems, recommendation systems, search algorithms etc. is that we’ve reduced all information to a series of binary 0s and 1s so that it conforms with probability calculations, and that none of our humanistic emotional-moral-conscious and qualifying information is being adequately or explicitly captured. This is why the existing algorithms are inherently flawed; they’re not allowing for any subjectivity or human ambiguity, only pure logic to process 0s and 1s.
Gregory Bateson — if he had lived in the WWW era (alas he passed away in 1980) — would probably take objection to this continued reduction of everything into mere matter and the ignoring of the “mind” (to which I’d add chemical emotion) in the algorithmic and scientific constructs that form the basis of computational mathematics.
Not to mention the fact that, as I noted elsewhere, thermodynamics and physics are phenomena of NATURAL LAW, where OBJECTIVE scientific reductionism is supposed to be applied to arrive at the ultimate proofs of concept, whilst economics is a MAN-MADE MARKET. That’s why economics is called a SOCIAL SCIENCE: there is a subjectivity involved in all market transactions, arising from emotion, ego and evocation by advertising.
Comparing the information in those three disciplines on a par, as if information were equivalent and commutative (or in simple terms, as if all of them were apples rather than some being apples and others animals), breaches some core principles of scientific analysis and mathematical discipline. Opinion then becomes fact without empirical rigor, soundness or coherency. Ergo, when an Oxford don like Professor Vedral says, “And it’s the same information in physics, in thermodynamics, in economics,” without making that delineation between NATURAL SCIENCE and SOCIAL SCIENCE, he inadvertently abrogates the contributions of human emotions and ego to the man-made creation that is economics. That is another reason someone like me (a regular person, not a Professor) needs to highlight it on the Web and try to build 360-2020, precisely so that those missing humanistic elements ARE captured A PRIORI to machine processing and in a format that is understandable to the machines.
Our human brains and language skills can contextualize but certainly the machines (the natural language, the AI, the semantic bots) can’t, and yet paradoxically these are the very tools indexing, browsing, crawling etc. our information online, categorizing it all and serving it back to us as “This is the prioritized list of links you should click.”
!!!!!!!!!!!!!!!!!!!!!!!! That’s the astounding situation !!!!!!!!!!!!!!!!!!!!
So already the flawed machines and their ignorant, unconscious, sub-optimal and incomplete codes are controlling us via the quality of information flowing towards us from search algorithms and recommendation systems — we, the conscious organic moral species are not receiving the quality of information that will enable us to become more intelligent or more capable of making sense because…………the machines can’t make sense!!!
Interestingly, when I was in Madrid last year a Spanish-Russian astrophysicist with over 40 years’ experience said I was a “breath of fresh air” to even have the audacity to imagine and approach 360-2020 in the way I am, given the dogmatic orthodoxy that now affects quantum physics and scientific methodologies generally. He was also a great sanity-check when he pointed out that we have difficulty defining “time” without using the words “duration”, “period” or “elapse”, and yet there is a whole raft of function calculations of the form F(t) and uses of time in some integral or differential as an a priori assumption. In fact, he observed we haven’t even written the definitive proof for time and yet we use it in all our mathematical assumptions and decision outcome calculations!
This takes us to another paradox as it relates to decision engines……..If, as with time, we can’t define or differentiate information beyond an a priori deterministic assumption that it exists, absolute -1 / 0 / +1 quantities, probability of proximity or topic clustering, and we can’t define how to capture the QUALITATIVE elements or which qualifiers to capture, then how can we determine that our decisions are properly differentiated, contextualized, coherent and make sense?
HOPE IS ON THE HORIZON: 360-2020
At the weekend one of my best friends, 兔, called me from New York. She’s a VP at a bank and works with technology every day. She’s essentially responsible for the smooth functioning of the electronic servers that process financial trades, troubleshoots system bugs and architects operational solutions. Anyway, I updated her on where I am with 360-2020 — she’s one of the rare people who’s actually played with the UI and the ratings tools — and she says that it’s “granular and captures the underlying motivations of why someone is looking for something online and what their tastes are”.
Another American friend recently wrote, “You always choose your words with optical precision,” which is wonderfully supportive, even though he wasn’t speaking specifically about 360-2020. It made me smile, pause a while and realize how appropriate it is that I trademarked my solution set 360-2020®. This is because we should not only assess a situation with 360-degree perspective but also try to see it with 20/20 clarity of vision, informed by our ever-evolving subjective tastes and perceptions.
Moreover, there’s no point being able to see 360 degrees if we can’t identify or define what it is we’re looking for and at in the first place! For example, if “Aga” is not in our lexicon then we’re going to have no idea it’s a type of cooker (and the machine learning algorithms are going to struggle to id it too). My mother would definitely struggle to identify one – LOL. Conversely, she can tell you what a “电饭锅” is and you’d have no idea if you didn’t know any Chinese or didn’t cook rice with it; it’s a rice cooker. Furthermore, the Aga operates with gas whilst the rice cooker is electronic, has more confined heat distribution and the Aga requires more space in your kitchen. With my Chinese hat on, I can also say that purely from the pictographic form “电饭锅” we know it’s electronic, it’s concerned with cooked rice rather than raw rice which is 米 and that this is a pot-machine made of metal. Meanwhile, the word “Aga” tells us almost nothing directly. Yes, there may be implicit associative connotations for those who own an Aga like “English country kitchen, Elizabeth David, homemade bread, etc” but in the word itself there are no axiomatic clues. By comparison, in Chinese characters, we get all sorts of explicit information in the radical compositions like what material something is made of, whether it’s associated with fire / water / wood / earth / air / speech / people / motion / one of the senses etc, its function and sometimes even its color.
Now………these are actually DIFFERENTIATION OBJECTS in our brains rather than decision points or binomial difference or even structured semantic data objects. There is no -1 or +1 between the Aga and the rice cooker and the summation of them algebraically doesn’t result in -1 / 0 / +1. There is also no semantic classification of them purely on a noun basis. Actually, there is a differentiation between their functions, power sources, material composition, heat distribution, space capacity, appeal and the foodstuffs being cooked which……….
SHOULD NOT AND CANNOT BE SIMULATED ON A SOCIAL GRAPH PLOTTING THE LINEAR REGRESSION RELATIONSHIP BETWEEN THE AGA AND THE RICE COOKER.
It’s somewhat silly to do this, since the probability that the Aga and the rice cooker are similar doesn’t matter and is of negligible qualitative value. We’re actually seeking to define their parameters of differentiation, not the probability of their similarity. See what I mean? Ask the right questions and we get closer to finding the right answers.
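To make the contrast concrete, here is a minimal Python sketch. The class, the attribute names and the values are illustrative assumptions taken loosely from the description above; this is not the 360-2020 implementation. Rather than collapsing the two cookers into a single similarity score, it lists the parameters on which they differ.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Cooker:
    # Hypothetical attributes, chosen only to mirror the prose above.
    name: str
    power_source: str
    heat_distribution: str
    footprint: str
    typical_food: str


def differentiate(a: Cooker, b: Cooker) -> dict:
    """Return each attribute on which a and b differ, with both values."""
    left, right = asdict(a), asdict(b)
    return {
        key: (left[key], right[key])
        for key in left
        if key != "name" and left[key] != right[key]
    }


aga = Cooker("Aga", "gas", "broad", "large", "bread")
rice_cooker = Cooker("电饭锅", "electric", "confined", "small", "rice")

differences = differentiate(aga, rice_cooker)
for attribute, (aga_value, rice_value) in differences.items():
    print(f"{attribute}: {aga_value} vs {rice_value}")
```

The output is a list of named differences, not a number between -1 and +1, which is the point being argued here.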
Now extend that from the Aga and the rice cooker to John and Jane, social networks of interesting content and recommendation systems…….We soon realize where the flaws and limitations in existing algorithms are…………Yes and I include existing machine learning that directs questions of a “who, what, when, where, why” nature. They’re a move in the right direction but still not differentiating enough.
I know this because these are examples of how I did differentiation metrics back in 2004 (and indeed before then):
Plus I worked in a hedge fund where we had over 5 models (neural nets, adaptive lag, linear regression, natural language etc.) so I have some sense of what information is inputted into various algorithms and how in today’s landscape of sentiment, decision and semantic engines (Google, Bing, Twitrratr, Facebook, Hunch, Nielsen Buzzmetrics et al)……..
EXISTING METRICS ARE NOT DIFFERENTIATING, THEY ARE DIFFERENCING.
DIFFERENTIATION: HOW CRITICAL IT IS FOR CHINESE BAMBINI
My mother says I could speak my first word when I was 6 months old. Readers need to be aware that it’s critically important for the Chinese baby to differentiate between multiple tones that can be applied to the vowels and these tones sound like notes of the musical scale. It’s not the same as when an English-speaking child pronounces “cut” instead of “cat” or “cot”, btw. Neither is it the same as when a French-speaking child can’t distinguish between the a in “chat” (cat) and “château” (castle).
In Chinese, the word “ma” (mother , 妈) completely changes both its meaning and the pictogram depending on the tone that’s applied to the “a” (acute, grave, bas, etc.). For those with an international keyboard, change the setting to ITABC and type out “ma” and we get these 17 options:
• 吗, 妈, 马, 嘛, 麻, 骂, 抹, 码, 玛, 蚂, 蚂, 摩, 唛, 杩, 嬷, 犸, 蟆
Only one of them is mother (妈). Amongst the other “ma” are a rhetorical inflexion at the end of a sentence to convert it into a question; a horse; dense (as in material); to grasp a handful (of something like sand); a character compound which makes up the word “careless, messy and lacking attention” and other meanings. So if the Chinese baby can’t differentiate — not merely tell the difference — then they can’t decide whether to call their mothers correctly with the right tone on the a or call their mothers “horses (马)”.
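A small Python sketch of the same idea, under simplifying assumptions: the table below is hand-built, holds only five common readings of “ma” keyed by tone number (0 for the neutral tone), and gives one character per tone, whereas a real input method offers several candidates for some tones. The names `MA_READINGS` and `candidates` are made up for this illustration.

```python
# Readings of the syllable "ma", keyed by tone number (0 = neutral tone).
# Only five common readings are listed; this is an illustration, not a dictionary.
MA_READINGS = {
    1: ("妈", "mother"),
    2: ("麻", "hemp"),
    3: ("马", "horse"),
    4: ("骂", "to scold"),
    0: ("吗", "question particle"),
}


def candidates(tone=None):
    """Return the possible characters for 'ma'; without a tone, all remain."""
    if tone is None:
        return list(MA_READINGS.values())
    return [MA_READINGS[tone]] if tone in MA_READINGS else []


print(len(candidates()))  # without the tone, every reading is still possible
print(candidates(1))      # with the tone, this small table narrows to one
```

Without the tone the syllable stays ambiguous; with it, the listener (or the program) can tell mother from horse before deciding what to say.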
ERGO =⇒ WE NEED TO BE ABLE TO DIFFERENTIATE………BEFORE WE CAN DECIDE.
Now the fact is that currently a lot of so-called decision engines are not able to properly differentiate. Sure, they can count the difference between the frequency of a link or data object being embedded in the html and the probability of their proximity. Even with the semantic web stack, these so-called decision engines or recommendation systems can, at the most, contextually classify the data object as either a person, a GPS co-ordinate, a company or a topic and cluster them. Unlike with Chinese, they wouldn’t be able to differentiate that the person that data object designates is female.
Remember that character for mother, 妈? Well… ….The left radical is the character for “female, woman” and the right radical is the character for “horse”; yes, English punning and wordplay could derive the term “brood mare” from this Chinese construction for “mother”. It may be interesting to note that the word for safety and peace is 安 which has the female character, 女, under the radical for roof, 宀. So under our mothers’ protection we are safe and have sanctuary; that’s how poetic and rich with meaning the Chinese language is. Not for the Chinese any connotations of woman as the source of original sin à la that femme fatale, Eve. The Chinese axiomatic references to women (or female) are all positive and involve harmony, prosperity and goodness. In fact the character for “good” is 好 which has the female radical, 女, on the left and the character 子 on the right. 子 means “son / daughter / child; person; ancient title for a learned or virtuous person; seed; egg; the first of the 12 Earthly branches”.
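The compositions just described can be written down as data. The following Python sketch uses a hand-built table covering only the three characters discussed here; the names `COMPONENTS`, `GLOSSES`, `explain` and `contains` are hypothetical, and a real system would need a full character-decomposition dataset.

```python
# Component breakdown for the three characters discussed above (hand-built).
COMPONENTS = {
    "妈": ("女", "马"),  # mother = woman + horse
    "安": ("宀", "女"),  # safety, peace = roof + woman
    "好": ("女", "子"),  # good = woman + child
}

GLOSSES = {"女": "woman", "马": "horse", "宀": "roof", "子": "child"}


def explain(character):
    """Spell out a character's components together with their meanings."""
    parts = COMPONENTS[character]
    return " + ".join(f"{part} ({GLOSSES[part]})" for part in parts)


def contains(character, component):
    """Check whether a character is built with the given component."""
    return component in COMPONENTS.get(character, ())


for character in COMPONENTS:
    print(character, "=", explain(character))
print(contains("妈", "女"))
```

The female component is explicit in the written form itself, so a lookup like `contains("妈", "女")` can answer a question that a bare “person” classification cannot.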
Importantly, it becomes clear that Chinese babies who grow up to be polyglot technologists deal with semantics on whole other scales and dimensions.
Amalgamating this necessary early linguistic differentiation of my Chinese heritage with life-to-date experiences and knowhow means that I leverage, contextualize and differentiate complex, stochastic and multi-disciplinary information differently from most norms. If a person knows where the synchs are and how to adapt tools from one discipline to unlock the answers for another discipline, it becomes possible to hybridize solutions or to at least catalyze their discovery.
Hence the sparking of 360-2020.
Do I believe it’s possible to train a machine to differentiate the way a human can naturally? YES. It’s unlikely to be an exact replication of our brains but we should bear in mind that there are variations from one brain to another’s ability to differentiate in the first place. Would someone who wasn’t Chinese be able to differentiate between the various “ma”? Would I be able to differentiate the Russian word for mother from the one for horse? So we need to accept that variation exists and that machine simulation will never exactly match the way our brains operate naturally.
Still, there are codes and frameworks that we can invent and apply to make the machines smarter and more capable of capturing and processing qualitative and not only quantitative input — to, essentially, contextualize beyond the binary 0s and 1s and data objects.
To construct………DIFFERENTIATION OBJECTS THAT CARRY QUANTITATIVE-QUALITATIVE-TIME ELEMENTS EXPLICITLY.
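As a rough illustration only: the post names the three elements but does not specify a schema, so every field and function name in this Python sketch is an assumption. It shows one record type that carries the quantitative, qualitative and time elements side by side, so that two observations with the same score can still be told apart.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DifferentiationObject:
    # Hypothetical schema: the post names the three elements but not their fields.
    subject: str
    quantity: float        # the countable or measurable part, e.g. a score
    qualities: tuple       # qualifiers such as mood or motivation
    observed_at: datetime  # when the observation was made


def timeline(observations, subject):
    """Return one subject's observations, oldest first."""
    matching = [o for o in observations if o.subject == subject]
    return sorted(matching, key=lambda o: o.observed_at)


observations = [
    DifferentiationObject("film", 4.0, ("nostalgic", "rewatch"), datetime(2010, 6, 1)),
    DifferentiationObject("film", 4.0, ("curious", "first viewing"), datetime(2009, 1, 15)),
]

first, second = timeline(observations, "film")
same_quantity = first.quantity == second.quantity
same_qualities = first.qualities == second.qualities
print(same_quantity, same_qualities)
```

The two records share a score of 4.0 yet differ in their qualifiers and dates, which is information a single number would have discarded.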
Now this requires collective imagination, cross-pollination, pragmatism, tenacity and audacity; evolving the existing economic-mathematical-computational models which are either predicated on deterministic or probabilistic methodologies; and *magic* collaborative code hands.
Ah and in case readers wonder whether I have any clue about the maths behind search algorithms, recommendation systems and translation software…………….. At university I got 99% in my Probability & Statistics exam, a 1st in my Linear Methods and Operational Research exams, a 1st for my Econometrics paper on the South-East Asian “tiger economies” and I did a fair amount of statistical programming. Given these insights on how information is sliced, diced and presented, let’s quote Mark Twain’s attribution to the C19th British Prime Minister, Benjamin Disraeli:
“There are three kinds of lies: lies, damned lies, and statistics.”
Never has this been truer than during the global financial crisis. Without exception, the forecasting and risk management systems are predicated on probability models (remember, I worked in finance so I have a grasp of this). If they’re so smart, why did the crisis happen and how is it possible the global bailout is US$ trillions not billions, according to a report from OECD Insights that was tweeted? Moreover, if probability is the solution to calculating all our human transactions and engagements then let’s ask three simple questions:
(1.) Why can’t probability calculate who we fall in love with or why we love some people like our parents but not others?
(2.) Why can’t probability pinpoint the morals, consciousness and emotional evolution over time that drive our quantitative purchases?
(3.) Why can’t probability explain altruism, philanthropy, religion, humor or humanity?
So these are examples of what I termed the “Probability Paradox of Non-Quantifiable Value” in a previous post. It’s not a trivial challenge to construct new algorithms that will train machines to differentiate qualitative attributes rather than to do standard decision tree binomials of quantities or difference of the frequency of incidences — as currently happens in recommendation systems and search engines.
My good friend GC says I am “a deep thinker and a perfectionist” and 360-2020 is the emerging distillation of that deep thinking: a differentiation solution. Incidentally, we’ll be constructing a seriously differentiated business model with symbiotic board structures for investors, management and the user community whilst we’re @T it.
:*).
360-2020 TAKES ON TURING
Yes and I will bet a can of Coke that the realization of 360-2020 will also crack the Turing Test. By now readers know that in 1950 Turing posed the question, “Can machines think?” whilst I believe the more interesting question is “Can machines make sense?”
In my reasoning from another post I wrote:
To date in IT development (including the Semantic Web), the definition of thinking machines or smart systems is predicated on their abilities to do the following:
· link (as in hyper-text)
· connect (as in social nets)
· compute / calculate (as in Deep Blue and Wolfram Alpha)
· choose (as in what to display at a specific time-geolocation)
· sort, filter and prioritize (as in eBay lists of items)
· rank (as in YouTube videos)
· re-direct (as in cookies in browsers)
· visually represent (as in Flickr on Google Maps)
· synch (as in iPhone with iTunes store and Apple Macbooks)
· stream (as in videos and IM channels)
· semantic structuring (as in Powerset, True Knowledge, Adaptive Blue)
· recommend (as in LinkedIn, Amazon, Friendfeed and socnets)
Now, some of us would argue that all of those attributes are the same as thinking, so if a machine can do those things then it must be as intelligent as a human, or even more so. Evidently, this isn’t the case yet; no machine (including Elbot, whom I had fun+games with) has even passed the Turing test, much less tests where a robot can make sense the way we do, with touch, taste, sight, hearing and smelling abilities to complement our neural, moral, memory, humor and relativism consciousness. We’re several years from The Terminator and Skynet (aka “The Cloud”).
Personally, I don’t want machines to be able to simply think. I want them to be able to MAKE SENSE.
If we look at ourselves as a species, 99 percent of us can think (some form of brain activity / electrical impulses) with less than 1 percent of us incapable of thought because of coma or brain damage. However, not all 99 percent of us are making sense.
Now we need to ask the question, “How do humans make sense?” It’s not by calculating differences between 0s and 1s or traveling down probability decision trees or topic clustering alone. It’s by differentiating with our language, cultural influences, emotions, morals, values, perceptions, humor, sensory intake and more.
Our consciousness.
Remember the Twain definition of information: Information is a consciousness of quantity and quality that enables differentiation and contextualization over time.
So 360-2020 may prove to be the quantum leap between Babbage’s difference engine and Turing’s test for machine intelligence.
Imagine that………………..Twain is………………………:*)
********************************************************
DEBATEGRAPH
They’re constantly doing interesting and innovative things to make sense of information in a collaborative way, so if readers haven’t checked it out here are some examples of important issues they’re tackling at debategraph.