Predictive Analytics or Cognitive Anticipation: What’s in Another Word? – Part 2
By Dr. Manuel Aparicio IV
When old words get stale and overloaded, new words can help reinvigorate our thinking. Summarizing the references on anticipation from my last post, I think there are at least three elements of “anticipation” that help us push away from more common “prediction”:
- Time is fundamental. Beyond input-output classification of a current state, time is fundamental to anticipatory representation and inference. Classifying what a thing currently is does not imply the full meaning of prediction as something expressly about the future. Classification and other methods are still fundamental, but lag times and sequence representations, for example, are also included. The inference is about what will happen next, at T+1 or in some actionable future, where high expectation meets the ability to change or prepare for the future. True anticipation estimates when something will occur, beyond only that it might occur. More subtly, anticipation optimizes learning and its value over the long term, considering multiple options and futures. As well, future events are causal; the cognitive process of expecting future events (like a scheduled meeting) leads to other expectations when the future is included as part of the current context.
- Learning is constant. Predictive modeling is typically a process of data modeling, which fits a mathematical function to the data in a train-test-deploy procedure, assuming the world doesn’t change after deployment. In a non-stationary world (the real world), incremental learning is required. Furthermore, in the definition of anticipation, incremental means instant. From a brain-like computing perspective, we don’t build and deploy fixed models. We learn all the time, with expectations and unanticipated novelties driving our attention, as is well known in the psychology of learning. We set expectations and then learn most about the world when it does not match those expectations. We attend to novelty, constantly learning more about the world in ways that might gain us more control over it and preparedness for it. Optimal unlearning and re-learning (what psychology calls extinction and spontaneous recovery) are also required, for example to track rapid changes in consumer taste or the economic environment.
- Fuzziness is acknowledged. The history of predictive analytics includes the history of Decision Theory and its measurement of “hits” and “misses”. Born out of World War II, the problem was to determine the cost/benefit of firing bullets. Hit or miss. Classification inferences must similarly commit to “firing” an answer and then suffer a miss if wrong. However, more modern measurements of Information Retrieval admit personal and relative relevance. A document is ranked as more or less relevant, not a hit or miss, ultimately defined as whether it is relevant to you and your personal meaning. Unlike AI logic and its assumption of “facts” to give right or wrong answers no matter who is asking, real world knowledge and real answers are more complex and inter-connected. As cognitive beings, we sometimes want or need to stay close to “home” with what we know, other times wanting or needing to explore and learn more. More modern approaches also distinguish classification from categorization, the latter being more contextual and creative such as when a plane (class vehicle) can be used as a bomb (class weapon). Conceptual structures need to be much more fluid, as we learned from our “failure of imagination” on September 11, 2001.
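To make the “learning is constant” point concrete, here is a minimal sketch of a learner that never has a separate training phase: it sets an expectation, observes the outcome, and updates immediately in proportion to its surprise. This is a textbook delta-rule update (in the spirit of error-driven models from the psychology of learning), not a description of Saffron’s method. The class name, the learning rate, and the two-phase toy world are all assumptions made for illustration.

```python
import random


class OnlineLearner:
    """Hypothetical delta-rule learner that updates after every observation."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def expect(self, features):
        # The expectation is a weighted sum of the cues that are present.
        return sum(w * x for w, x in zip(self.weights, features))

    def observe(self, features, outcome):
        # Surprise is the gap between what happened and what was expected.
        surprise = outcome - self.expect(features)
        # Learn instantly: each present cue is adjusted in proportion to surprise.
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * surprise * x
        return surprise


def run_demo(steps_per_phase=200, seed=0):
    """Toy non-stationary world: cue 0 predicts the outcome, then cue 1 does."""
    rng = random.Random(seed)
    learner = OnlineLearner(n_features=2)
    surprises = []
    for predictive_cue in (0, 1):  # the world changes between the two phases
        for _ in range(steps_per_phase):
            features = [rng.choice([0.0, 1.0]), rng.choice([0.0, 1.0])]
            outcome = features[predictive_cue]
            surprises.append(abs(learner.observe(features, outcome)))
    return learner, surprises


if __name__ == "__main__":
    learner, surprises = run_demo()
    print("final weights:", [round(w, 2) for w in learner.weights])
    print("mean surprise before change:", sum(surprises[180:200]) / 20)
    print("mean surprise after change: ", sum(surprises[200:220]) / 20)
```

In this toy world the surprise jumps when the world changes and then fades as the old association is unlearned and the new one is acquired, with no retraining or redeployment step.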
When we think this way, we think more creatively and long term. I remember working with Dan Ariely on advanced personalization systems comparing individual versus collaborative inferencing when faced with new product arrivals and consumer taste changes. As now famously known from his books, Dan has a beautiful way with words. I remember his warning, “The best set of recommendations is not the set of best recommendations.” For example, if I know that you love expensive Merlot wines from California, I should NOT give you five recommendations of expensive Merlots from California. There is no variety; no exploration; no choice in the choice. I could maximize my likelihood of a “hit” on this single transaction, but I would do better to maximize my knowledge of you and your knowledge of my catalogue for the long term. I should include a Californian Merlot or two, of course, but also offer, “I think you might like to explore something new I just received. Let me know how you like it.” Rather than ensure the prediction of a single purchase, I will do better in developing your lifetime value as a customer by exploring. Although the distinction is subtle, I will do better to anticipate what you might like rather than predict what you will like. As a criticism against predictive analytics, your tastes will change. If I can learn on the fly (my own brain can, as can Saffron), then these opportunities allow me to learn more about you. If you can learn on the fly (you can), you might find a new taste sensation and remember me for not just my new product but for my service in giving you a real choice.
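Dan’s warning can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `Wine` record, the preference scores, the catalogue, and the rule of reserving two “explore” slots for unseen styles (new arrivals first) are all hypothetical, and this is not how Saffron or any particular recommender builds its lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    name: str
    style: str           # e.g. "California Merlot"
    score: float         # hypothetical predicted preference, 0..1
    is_new: bool = False


def build_slate(catalogue, size=5, explore_slots=2):
    """Fill most slots with top-scoring wines, reserving some for unseen styles."""
    ranked = sorted(catalogue, key=lambda w: w.score, reverse=True)
    exploit = ranked[: size - explore_slots]
    seen_styles = {w.style for w in exploit}

    # Exploration candidates: new arrivals first, then by predicted score.
    candidates = sorted(
        (w for w in ranked if w not in exploit),
        key=lambda w: (not w.is_new, -w.score),
    )
    explore = []
    for wine in candidates:
        if len(explore) == explore_slots:
            break
        if wine.style not in seen_styles:
            explore.append(wine)
            seen_styles.add(wine.style)

    # If the catalogue lacks enough unseen styles, fall back to the next best.
    for wine in ranked:
        if len(exploit) + len(explore) >= size:
            break
        if wine not in exploit and wine not in explore:
            explore.append(wine)
    return exploit + explore


def demo_catalogue():
    """Made-up catalogue for a customer who loves California Merlot."""
    merlots = [
        Wine(f"Merlot {i}", "California Merlot", 0.95 - 0.01 * i) for i in range(5)
    ]
    others = [
        Wine("Pinot Noir A", "Oregon Pinot Noir", 0.60, is_new=True),
        Wine("Syrah B", "Washington Syrah", 0.55),
        Wine("Carmenere C", "Chilean Carmenere", 0.50, is_new=True),
    ]
    return merlots + others


if __name__ == "__main__":
    for wine in build_slate(demo_catalogue()):
        print(f"{wine.name:14s} {wine.style:20s} {wine.score:.2f}")
```

Ranking purely by score would return five California Merlots. Reserving two slots trades a little expected “hit” rate on this one transaction for a chance to learn something new about the customer.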
Saffron provides “connect the dots” anticipatory sense making as well as “illuminate the dots” predictive/anticipatory analytics. Our instant-learning, nonlinear, high-dimensional, non-parametric classifier has always been at our product’s core. Also due to Dan, we had the pleasure of working with the founders of OpenField Software (see them now at ProductOps), who developed Electronic Learning Assistant (ELLA) for personalized email filtering back in 2002. You need The Wayback Machine to read the glowing reviews at the time, including results of a competitive shootout against 8 competitors (using Bayesian, collaborative, rules, and other methods). ELLA was named “World’s Best Spam Blocker” by PC User Magazine. User reviews hailed it as near flawless, learning quickly from few examples and instantly adapting to changes in email use and in the environment of spam and other “incoming”.
As shared at Predictive and Text Analytics World, Saffron’s predictive methods continue to be applied to Threat Scoring for the Bill and Melinda Gates Foundation and Condition Based Maintenance for a Global Fortune 100 Manufacturer, demonstrating 100% recall with only a 1% false alarm rate. More recent work with Dr. Partho Sengupta at Mt. Sinai Hospital diagnosed heart disease in high-dimensional, nonlinear echocardiogram data from relatively few examples – and no parameter tuning. See Dr. Sengupta’s astounding ASE Feigenbaum Lecture on Big Data and the need for such newer analytic methods in healthcare.
Saffron does prediction. But more toward the time-based definition of anticipation, SaffronMemoryBase has also been successful in predicting the survivability of liver cancer based on a tumor’s mRNA expressions (a highly non-linear control network of all other gene expressions). Looking into the future over more than a year’s time frame, Saffron correctly predicted 90% of patient survival times to the exact week – to the exact week. But such prognostic precision is depressing. We are moving on to the anticipation of treatment options that anticipate better prognoses. Saffron will be further addressing personalization, both for general consumers and healthcare consumers, so watch this space.
Prediction is hot! It seems wrong to say “Prediction is dead” without adding “Long live Prediction!” – if we mean new approaches to prediction as Dr. Hofmann hinted at Text Analytics World last week. More will be said about Saffron’s underlying approach in the near future. The fundamental requirements from classifying diseases to predicting customer churn remain. But anticipation leads to a richer and more cognitive theory of machine learning. Rather than fixed predictive models, an anticipatory system is always exploring and adapting. Our brains are constantly anticipating the next time step and other future time frames, constantly evaluating many contingencies, and if wrong or merely curious, constantly and instantly learning more.
Anticipation is the future.