ARTIFICIAL INTELLIGENCE
An AI-native strategy firm. Global Advisors: a consulting leader in defining quantified strategy, decreasing uncertainty, improving decisions and achieving measurable results.
A Different Kind of Partner in an AI World
AI-native strategy
consulting
Experienced hires
We are hiring experienced top-tier strategy consultants
Quantified Strategy
Decreased uncertainty, improved decisions
Global Advisors is a leader in defining quantified strategies, decreasing uncertainty, improving decisions and achieving measurable results.
We specialise in providing highly-analytical data-driven recommendations in the face of significant uncertainty.
We utilise advanced predictive analytics to build robust strategies and enable our clients to make calculated decisions.
We support implementation of adaptive capability and capacity.
Our latest
Thoughts
Global Advisors’ Thoughts: Leading a deliberate life
By Marc Wilson
Marc is a partner at Global Advisors and based in Johannesburg, South Africa
Download this article at https://globaladvisors.biz/blog/2018/06/26/leading-a-deliberate-life/.
Picket fences. Family of four. Management position.
Mid-life crisis. Meaning. Purpose.
Someone once said, “At 18, I had all the answers. At 35, I realised I didn’t know the question.”
Serendipity has a lot going for it. Many people might sail through life taking what comes and enjoying the moment. Others might be open to chance and have nothing go right for them.
Some people might strive to achieve, realise rare successes and be bitterly unhappy. Others might be driven and enjoy incredible success and fulfilment.
Perhaps the majority of us become beholden to the momentum of our lives.
We might study, start a career, marry, buy a dream house, have children, send them to a top school. Those steps make up components of many of our dreams. They are steps that may define each subsequent choice. As I discussed this with a friend recently, he remarked that few of these steps had been the subject of deliberation in his life – increasingly these steps were the outcome of momentum. Each will shape every step he takes for the rest of his life. He would not have things any other way, but if he had known then what he knows now, he might have been more deliberate about choice and consequence…
Read more at https://globaladvisors.biz/blog/2018/06/26/leading-a-deliberate-life/
Strategy Tools
PODCAST: Strategy Tools: Growth, Profit or Returns?
Our Spotify podcast explores the relationship between Return on Net Assets (RONA) and growth, arguing that both are essential for shareholder value creation. The hosts contend that focusing solely on one metric can be detrimental, and propose a framework for evaluating business portfolios based on their RONA and growth profiles. This approach involves plotting business units on a “market-cap curve” to identify value-accretive and value-destructive segments.
The podcast also addresses the impact of economic downturns on portfolio management, suggesting strategies for both offensive and defensive approaches. The core argument is that companies should aim to achieve a balance between RONA and growth, acknowledging that both are essential for long-term shareholder value creation.
Read more from the original article – https://globaladvisors.biz/2020/08/04/strategy-tools-growth-profit-or-returns/

Fast Facts
Fast Fact: The rate of technology adoption exploded in the 1990s
The 1990s were an inflection point in the adoption of new technologies. While radio showed fast adoption in the 1920s, new technologies introduced after 2010 reached penetration of more than 30% of the United States population within three years of launch. PCs...
Selected News
Term: Reinforcement Learning (RL)
“Reinforcement Learning (RL) is a machine learning method where an agent learns optimal behavior through trial-and-error interactions with an environment, aiming to maximize a cumulative reward signal over time.” – Reinforcement Learning (RL)
Definition
Reinforcement Learning (RL) is a machine learning method in which an intelligent agent learns to make optimal decisions by interacting with a dynamic environment, receiving feedback in the form of rewards or penalties, and adjusting its behaviour to maximise cumulative rewards over time.[1] Unlike supervised learning, which relies on labelled training data, RL enables systems to discover effective strategies through exploration and experience without explicit programming of desired outcomes.[4]
Core Principles
RL is fundamentally grounded in the concept of trial-and-error learning, mirroring how humans naturally acquire skills and knowledge.[2] The approach is based on the Markov Decision Process (MDP), a mathematical framework that models decision-making through discrete time steps.[8] At each step, the agent observes its current state, selects an action based on its policy, receives feedback from the environment, and updates its knowledge accordingly.[1]
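The MDP framework described above can be sketched as data: states, actions, transition dynamics, and rewards. The toy two-state example below is purely illustrative (the states, actions, and reward values are assumptions, not taken from the text):

```python
# A toy two-state MDP, illustrative only: states "idle"/"working",
# actions "start"/"stop". transitions[(state, action)] gives the next
# state; rewards[(state, action)] gives the immediate reward.
transitions = {
    ("idle", "start"): "working",
    ("idle", "stop"): "idle",
    ("working", "start"): "working",
    ("working", "stop"): "idle",
}
rewards = {
    ("idle", "start"): 0.0,
    ("idle", "stop"): -1.0,     # staying idle is penalised
    ("working", "start"): 1.0,  # productive work earns reward
    ("working", "stop"): 0.0,
}

def step(state, action):
    """One discrete MDP time step: return (next_state, reward)."""
    return transitions[(state, action)], rewards[(state, action)]

# One short episode under a fixed policy that always chooses "start".
state, total = "idle", 0.0
for _ in range(3):
    state, r = step(state, "start")
    total += r
# Rewards collected: 0.0, 1.0, 1.0 -> total 2.0
```

The agent-environment loop at each step is exactly the observe-act-receive cycle the paragraph describes; a real MDP would add stochastic transitions and a discount factor.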
Essential Components
Four core elements define any reinforcement learning system:
- Agent: The learning entity or autonomous system that makes decisions and takes actions.[2]
- Environment: The dynamic problem space containing variables, rules, boundary values, and valid actions with which the agent interacts.[2]
- Policy: A strategy or mapping that defines which action the agent should take in any given state, ranging from simple rules to complex computations.[1]
- Reward Signal: Positive, negative, or zero feedback values that guide the agent towards optimal behaviour and represent the goal of the learning problem.[1]
Additionally, a value function evaluates the long-term desirability of states by considering future outcomes, enabling agents to balance immediate gains against broader objectives.[1] Some systems employ a model that simulates the environment to predict action consequences, facilitating planning and strategic foresight.[1]
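The value function's role, weighing immediate gains against future outcomes, can be illustrated with a discounted-return calculation. The discount factor and reward sequences below are illustrative assumptions:

```python
def discounted_return(reward_sequence, gamma=0.9):
    """The value of following a course of action: the discounted sum
    of the rewards it yields, where gamma < 1 down-weights the future."""
    return sum(r * gamma**t for t, r in enumerate(reward_sequence))

# A short-sighted choice earns 1.0 now and nothing afterwards;
# a far-sighted choice sacrifices now (-1.0) for larger later rewards.
greedy = discounted_return([1.0, 0.0, 0.0])    # 1.0
patient = discounted_return([-1.0, 2.0, 2.0])  # -1.0 + 1.8 + 1.62 = 2.42
```

Despite the worse immediate reward, the patient sequence has the higher value, which is exactly the trade-off a value function lets an agent make.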
Learning Mechanism
The RL process operates through iterative cycles of interaction. The agent observes its environment, executes an action according to its current policy, receives a reward or penalty, and updates its knowledge based on this feedback.[1] Crucially, RL algorithms can handle delayed gratification, recognising that optimal long-term strategies may require short-term sacrifices or temporary penalties.[2] The agent continuously balances exploration (attempting novel actions to discover new possibilities) with exploitation (leveraging known effective actions) to progressively improve cumulative rewards.[1]
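The exploration-exploitation balance is commonly implemented with an ε-greedy rule, a standard technique (chosen here for illustration, not prescribed by the text): with probability ε the agent tries a random action, otherwise it takes the best-known one.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """With probability epsilon pick a random action (exploration);
    otherwise pick the action with the highest estimate (exploitation)."""
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_values.get)

# Estimated action values on a two-armed bandit (illustrative numbers).
q = {"left": 0.2, "right": 0.7}
choice = epsilon_greedy(q, epsilon=0.0)  # epsilon=0 -> pure exploitation
```

With ε = 0 the rule always exploits the current best estimate; raising ε trades some short-term reward for the chance of discovering a better action, mirroring the balance described above.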
Mathematical Foundation
The self-reinforcement algorithm updates a memory matrix according to the following routine at each iteration:[5]
1. In situation s, perform action a.
2. Receive the consequence situation s′.
3. Compute the state evaluation v(s′) of the consequence situation.
4. Update memory: w′(a, s) = w(a, s) + v(s′).
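The update routine above can be transcribed directly, storing the memory matrix w as a dictionary keyed by (action, situation). The environment dynamics and the evaluation function here are illustrative stand-ins:

```python
# Memory matrix w(a, s) as a dict; all numbers are illustrative.
w = {("go", "s0"): 0.0, ("wait", "s0"): 0.0}

def consequence(s, a):
    """Toy environment dynamics: the situation reached by doing a in s."""
    return "s1" if a == "go" else "s0"

def v(s):
    """Toy state evaluation of a consequence situation."""
    return 1.0 if s == "s1" else -0.5

def update(w, s, a):
    """One iteration: perform a in s, observe s', then apply
    w'(a, s) = w(a, s) + v(s')."""
    s_next = consequence(s, a)
    w[(a, s)] = w[(a, s)] + v(s_next)
    return s_next

update(w, "s0", "go")    # w[("go","s0")]   becomes 0.0 + v("s1") =  1.0
update(w, "s0", "wait")  # w[("wait","s0")] becomes 0.0 + v("s0") = -0.5
```

Repeated iterations accumulate evaluations into w, so actions that lead to well-evaluated situations build up larger memory entries.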
Practical Applications
RL has demonstrated transformative potential across multiple domains. Autonomous vehicles learn to navigate complex traffic environments by receiving rewards for safe driving behaviours and penalties for collisions or traffic violations.[1] Game-playing AI systems, such as chess engines, learn winning strategies through repeated play and feedback on moves.[3] Robotics applications leverage RL to develop complex motor skills, enabling robots to grasp objects, move efficiently, and perform delicate tasks in manufacturing, logistics, and healthcare settings.[3]
Distinction from Other Learning Paradigms
RL occupies a distinct position within machine learning’s three primary paradigms. Whereas supervised learning reduces errors between predicted and correct responses using labelled training data, and unsupervised learning identifies patterns in unlabelled data, RL relies on general evaluations of behaviour rather than explicit correct answers.[4] This fundamental difference makes RL particularly suited to problems where optimal solutions are unknown a priori and must be discovered through environmental interaction.
Historical Context and Theoretical Foundations
Reinforcement learning emerged from psychological theories of animal learning and played pivotal roles in early artificial intelligence systems.[4] The field has evolved to become one of the most powerful approaches for creating intelligent systems capable of solving complex, real-world problems in dynamic and uncertain environments.[3]
Related Theorist: Richard S. Sutton
Richard S. Sutton stands as one of the most influential figures in modern reinforcement learning theory and practice. Born in 1956, Sutton earned his PhD in computer science from the University of Massachusetts Amherst in 1984, where he worked alongside Andrew Barto, a collaboration that would fundamentally shape the field.
Sutton’s seminal contributions include the development of temporal-difference (TD) learning, a revolutionary algorithm that bridges classical conditioning from animal learning psychology with modern computational approaches. TD learning enables agents to learn from incomplete sequences of experience, updating value estimates based on predictions rather than waiting for final outcomes. This breakthrough proved instrumental in training the world-champion backgammon-playing program TD-Gammon in the early 1990s, demonstrating RL’s practical power.
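The key idea of TD learning, updating a value estimate from the next prediction rather than waiting for the final outcome, is captured by the standard TD(0) update V(s) ← V(s) + α[r + γV(s′) − V(s)]. The states, step size, and rewards below are illustrative:

```python
def td0_update(V, s, r, s_next, alpha=0.5, gamma=0.9):
    """TD(0): move V(s) toward the bootstrapped target r + gamma * V(s'),
    i.e. learn from the next prediction instead of the final outcome."""
    target = r + gamma * V[s_next]
    V[s] = V[s] + alpha * (target - V[s])
    return V[s]

V = {"A": 0.0, "B": 1.0}  # current value estimates (illustrative)
new_value = td0_update(V, "A", r=0.0, s_next="B")
# target = 0.0 + 0.9 * 1.0 = 0.9; V["A"] moves halfway there: 0.45
```

Because the target uses the estimate V(s′) rather than an observed final return, the agent can learn mid-episode from incomplete sequences of experience, which is the breakthrough the paragraph describes.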
In 1998, Sutton and Barto published Reinforcement Learning: An Introduction, which became the definitive textbook in the field.[10] This work synthesised decades of research into a coherent framework, making RL accessible to researchers and practitioners worldwide. The book’s influence cannot be overstated: it established the mathematical foundations, terminology, and conceptual frameworks that continue to guide contemporary research.
Sutton’s career has spanned academia and industry, including positions at the University of Alberta and Google DeepMind. His work on policy gradient methods and actor-critic architectures provided theoretical underpinnings for deep reinforcement learning systems that achieved superhuman performance in complex domains. Beyond specific algorithms, Sutton championed the view that RL represents a fundamental principle of intelligence itself-that learning through interaction with environments is central to how intelligent systems, biological or artificial, acquire knowledge and capability.
His intellectual legacy extends beyond technical contributions. Sutton advocated for RL as a unifying framework for understanding intelligence, arguing that the reward signal represents the true objective of learning systems. This perspective has influenced how researchers conceptualise artificial intelligence, shifting focus from pattern recognition towards goal-directed behaviour and autonomous decision-making in uncertain environments.
References
1. https://www.geeksforgeeks.org/machine-learning/what-is-reinforcement-learning/
2. https://aws.amazon.com/what-is/reinforcement-learning/
3. https://cloud.google.com/discover/what-is-reinforcement-learning
4. https://cacm.acm.org/federal-funding-of-academic-research/rediscovering-reinforcement-learning/
5. https://en.wikipedia.org/wiki/Reinforcement_learning
7. https://www.mathworks.com/discovery/reinforcement-learning.html
8. https://en.wikipedia.org/wiki/Machine_learning
9. https://www.ibm.com/think/topics/reinforcement-learning
10. https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

Services
Global Advisors is different
We help clients to measurably improve strategic decision-making and the results they achieve through defining clearly prioritised choices, reducing uncertainty, winning hearts and minds and partnering to deliver.
Our difference is embodied in our team. Our values define us.
Corporate portfolio strategy
Define optimal business portfolios aligned with investor expectations
BUSINESS UNIT STRATEGY
Define how to win against competitors
Reach full potential
Understand your business’ core, reach full potential and grow into optimal adjacencies
Deal advisory
M&A, due diligence, deal structuring, balance sheet optimisation
Global Advisors Digital Data Analytics
14 years of quantitative and data science experience
An enabler to delivering quantified strategy and accelerated implementation
Digital enablement, acceleration and data science
Leading-edge data science and digital skills
Experts in large data processing, analytics and data visualisation
Developers of digital proof-of-concepts
An accelerator for Global Advisors and our clients
Join Global Advisors
We hire and grow amazing people
Consultants join our firm based on a fit with our values, culture and vision. They believe in and are excited by our differentiated approach. They realise that working on our clients’ most important projects is a privilege. While the problems we solve are strategic to clients, consultants recognise that solutions primarily require hard work – rigorous and thorough analysis, partnering with client team members to overcome political and emotional obstacles, and a large investment in knowledge development and self-growth.
Get In Touch
16th Floor, The Forum, 2 Maude Street, Sandton, Johannesburg, South Africa
+27 11 461 6371
