
ARTIFICIAL INTELLIGENCE

An AI-native strategy firm

Global Advisors: a consulting leader in defining quantified strategy, decreasing uncertainty, improving decisions and achieving measurable results.


A Different Kind of Partner in an AI World

AI-native strategy consulting

Experienced hires

We are hiring experienced top-tier strategy consultants

Quantified Strategy

Decreased uncertainty, improved decisions

Global Advisors is a leader in defining quantified strategies, decreasing uncertainty, improving decisions and achieving measurable results.

We specialise in providing highly analytical, data-driven recommendations in the face of significant uncertainty.

We utilise advanced predictive analytics to build robust strategies and enable our clients to make calculated decisions.

We support implementation of adaptive capability and capacity.

Our latest

Thoughts

Global Advisors’ Thoughts: Leading a deliberate life


By Marc Wilson
Marc is a partner at Global Advisors and based in Johannesburg, South Africa

Download this article at https://globaladvisors.biz/blog/2018/06/26/leading-a-deliberate-life/.

Picket fences. Family of four. Management position.

Mid-life crisis. Meaning. Purpose.

Someone once said: “At 18, I had all the answers. At 35, I realised I didn’t know the question.”

Serendipity has a lot going for it. Many people might sail through life taking what comes and enjoying the moment. Others might be open to chance and have nothing go right for them.

Some people might strive to achieve, realise rare successes and be bitterly unhappy. Others might be driven and enjoy incredible success and fulfilment.

Perhaps the majority of us become beholden to the momentum of our lives.

We might study, start a career, marry, buy a dream house, have children, send them to a top school. Those steps make up components of many of our dreams. They are steps that may define each subsequent choice. As I discussed this with a friend recently, he remarked that few of these steps had been the subject of deliberation in his life; increasingly they were the outcome of momentum. Each will shape every step he takes for the rest of his life. He would not have things any other way, but if he had known then what he knows now, he might have been more deliberate about choice and consequence…

Read more at https://globaladvisors.biz/blog/2018/06/26/leading-a-deliberate-life/



Strategy Tools

PODCAST: Strategy Tools: Growth, Profit or Returns?


Our Spotify podcast explores the relationship between Return on Net Assets (RONA) and growth, arguing that both are essential for shareholder value creation. The hosts contend that focusing solely on one metric can be detrimental, and propose a framework for evaluating business portfolios based on their RONA and growth profiles. This approach involves plotting business units on a “market-cap curve” to identify value-accretive and value-destructive segments.

The podcast also addresses the impact of economic downturns on portfolio management, suggesting strategies for both offensive and defensive approaches. The core argument is that companies should aim to achieve a balance between RONA and growth, acknowledging that both are essential for long-term shareholder value creation.

Read more from the original article – https://globaladvisors.biz/2020/08/04/strategy-tools-growth-profit-or-returns/


Fast Facts

Selected News

Term: Reinforcement Learning (RL)


“Reinforcement Learning (RL) is a machine learning method where an agent learns optimal behavior through trial-and-error interactions with an environment, aiming to maximize a cumulative reward signal over time.” – Reinforcement Learning (RL)

Definition

Reinforcement Learning (RL) is a machine learning method in which an intelligent agent learns to make optimal decisions by interacting with a dynamic environment, receiving feedback in the form of rewards or penalties, and adjusting its behaviour to maximise cumulative rewards over time.[1] Unlike supervised learning, which relies on labelled training data, RL enables systems to discover effective strategies through exploration and experience without explicit programming of desired outcomes.[4]

Core Principles

RL is fundamentally grounded in the concept of trial-and-error learning, mirroring how humans naturally acquire skills and knowledge.[2] The approach is based on the Markov Decision Process (MDP), a mathematical framework that models decision-making through discrete time steps.[8] At each step, the agent observes its current state, selects an action based on its policy, receives feedback from the environment, and updates its knowledge accordingly.[1]

Essential Components

Four core elements define any reinforcement learning system:

  • Agent: The learning entity or autonomous system that makes decisions and takes actions.[2]
  • Environment: The dynamic problem space containing variables, rules, boundary values, and valid actions with which the agent interacts.[2]
  • Policy: A strategy or mapping that defines which action the agent should take in any given state, ranging from simple rules to complex computations.[1]
  • Reward Signal: Positive, negative, or zero feedback values that guide the agent towards optimal behaviour and represent the goal of the learning problem.[1]

Additionally, a value function evaluates the long-term desirability of states by considering future outcomes, enabling agents to balance immediate gains against broader objectives.[1] Some systems employ a model that simulates the environment to predict action consequences, facilitating planning and strategic foresight.[1]

Learning Mechanism

The RL process operates through iterative cycles of interaction. The agent observes its environment, executes an action according to its current policy, receives a reward or penalty, and updates its knowledge based on this feedback.[1] Crucially, RL algorithms can handle delayed gratification, recognising that optimal long-term strategies may require short-term sacrifices or temporary penalties.[2] The agent continuously balances exploration (attempting novel actions to discover new possibilities) with exploitation (leveraging known effective actions) to progressively improve cumulative rewards.[1]
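The cycle above can be sketched as a minimal tabular Q-learning loop. The toy "chain" environment and the parameter values (alpha, gamma, epsilon) are illustrative assumptions, not taken from the source; they simply make the observe-act-reward-update cycle and the exploration/exploitation trade-off concrete.

```python
import random

# A minimal tabular Q-learning sketch of the observe-act-reward-update
# cycle described above. The toy "chain" environment and the parameter
# values (alpha, gamma, epsilon) are illustrative assumptions.

N_STATES = 5                      # states 0..4; state 4 is the goal
ACTIONS = [0, 1]                  # 0 = move left, 1 = move right
alpha, gamma, epsilon = 0.1, 0.9, 0.2
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment: reward 1.0 only on reaching the right end of the chain."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        # Exploration vs exploitation: random action with probability epsilon
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward = step(state, action)
        # Temporal-difference update of the action-value estimate
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = nxt

# The greedy policy learned from Q should move right from every state
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Here the value function discussed earlier corresponds to the learned Q-table, and epsilon implements the balance between exploring novel actions and exploiting known effective ones.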

Mathematical Foundation

The self-reinforcement algorithm updates a memory matrix according to the following routine at each iteration:

1. Given situation s, perform action a.
2. Receive consequence situation s'.
3. Compute a state evaluation v(s') of the consequence situation.
4. Update memory: w'(a, s) = w(a, s) + v(s').[5]
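The routine can be transcribed almost literally in code. The situation and action types, the transition function, and the evaluation function v are hypothetical placeholders; only the update rule w'(a, s) = w(a, s) + v(s') comes from the text.

```python
from collections import defaultdict

# A near-literal transcription of the four-step routine above. The
# transition and evaluation functions are hypothetical placeholders;
# only the update rule w'(a, s) = w(a, s) + v(s') comes from the text.

w = defaultdict(float)  # memory matrix, indexed by (action, situation)

def self_reinforcement_step(s, a, transition, v):
    """One iteration: act, receive the consequence, evaluate it, update memory."""
    s_next = transition(s, a)   # receive consequence situation s'
    w[(a, s)] += v(s_next)      # w'(a, s) = w(a, s) + v(s')
    return s_next

# Toy usage: situations are integers, the action increments them,
# and v simply values larger situations more highly.
s_next = self_reinforcement_step(0, "inc", lambda s, a: s + 1, float)
print(w[("inc", 0)])
```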

Practical Applications

RL has demonstrated transformative potential across multiple domains. Autonomous vehicles learn to navigate complex traffic environments by receiving rewards for safe driving behaviours and penalties for collisions or traffic violations.[1] Game-playing AI systems, such as chess engines, learn winning strategies through repeated play and feedback on moves.[3] Robotics applications leverage RL to develop complex motor skills, enabling robots to grasp objects, move efficiently, and perform delicate tasks in manufacturing, logistics, and healthcare settings.[3]

Distinction from Other Learning Paradigms

RL occupies a distinct position within machine learning’s three primary paradigms. Whereas supervised learning reduces errors between predicted and correct responses using labelled training data, and unsupervised learning identifies patterns in unlabelled data, RL relies on general evaluations of behaviour rather than explicit correct answers.[4] This fundamental difference makes RL particularly suited to problems where optimal solutions are unknown a priori and must be discovered through environmental interaction.

Historical Context and Theoretical Foundations

Reinforcement learning emerged from psychological theories of animal learning and played pivotal roles in early artificial intelligence systems.[4] The field has evolved to become one of the most powerful approaches for creating intelligent systems capable of solving complex, real-world problems in dynamic and uncertain environments.[3]

Related Theorist: Richard S. Sutton

Richard S. Sutton stands as one of the most influential figures in modern reinforcement learning theory and practice. Born in 1956, Sutton earned his PhD in computer science from the University of Massachusetts Amherst in 1984, where he worked alongside Andrew Barto, a collaboration that would fundamentally shape the field.

Sutton’s seminal contributions include the development of temporal-difference (TD) learning, a revolutionary algorithm that bridges classical conditioning from animal learning psychology with modern computational approaches. TD learning enables agents to learn from incomplete sequences of experience, updating value estimates based on predictions rather than waiting for final outcomes. This breakthrough proved instrumental in training the world-champion backgammon-playing program TD-Gammon in the early 1990s, demonstrating RL’s practical power.

In 1998, Sutton and Barto published Reinforcement Learning: An Introduction, which became the definitive textbook in the field.[10] This work synthesised decades of research into a coherent framework, making RL accessible to researchers and practitioners worldwide. The book’s influence cannot be overstated: it established the mathematical foundations, terminology, and conceptual frameworks that continue to guide contemporary research.

Sutton’s career has spanned academia and industry, including positions at the University of Alberta and Google DeepMind. His work on policy gradient methods and actor-critic architectures provided theoretical underpinnings for deep reinforcement learning systems that achieved superhuman performance in complex domains. Beyond specific algorithms, Sutton championed the view that RL represents a fundamental principle of intelligence itself: that learning through interaction with environments is central to how intelligent systems, biological or artificial, acquire knowledge and capability.

His intellectual legacy extends beyond technical contributions. Sutton advocated for RL as a unifying framework for understanding intelligence, arguing that the reward signal represents the true objective of learning systems. This perspective has influenced how researchers conceptualise artificial intelligence, shifting focus from pattern recognition towards goal-directed behaviour and autonomous decision-making in uncertain environments.

References

1. https://www.geeksforgeeks.org/machine-learning/what-is-reinforcement-learning/

2. https://aws.amazon.com/what-is/reinforcement-learning/

3. https://cloud.google.com/discover/what-is-reinforcement-learning

4. https://cacm.acm.org/federal-funding-of-academic-research/rediscovering-reinforcement-learning/

5. https://en.wikipedia.org/wiki/Reinforcement_learning

6. https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-reinforcement-learning

7. https://www.mathworks.com/discovery/reinforcement-learning.html

8. https://en.wikipedia.org/wiki/Machine_learning

9. https://www.ibm.com/think/topics/reinforcement-learning

10. https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf




Services

Global Advisors is different

We help clients to measurably improve strategic decision-making and the results they achieve through defining clearly prioritised choices, reducing uncertainty, winning hearts and minds and partnering to deliver.

Our difference is embodied in our team. Our values define us.

Corporate portfolio strategy

Define optimal business portfolios aligned with investor expectations

BUSINESS UNIT STRATEGY

Define how to win against competitors

Reach full potential

Understand your business’ core, reach full potential and grow into optimal adjacencies

Deal advisory

M&A, due diligence, deal structuring, balance sheet optimisation

Global Advisors Digital Data Analytics

14 years of quantitative and data science experience

An enabler to delivering quantified strategy and accelerated implementation

Digital enablement, acceleration and data science

Leading-edge data science and digital skills

Experts in large data processing, analytics and data visualisation

Developers of digital proof-of-concepts

An accelerator for Global Advisors and our clients

Join Global Advisors

We hire and grow amazing people

Consultants join our firm based on a fit with our values, culture and vision. They believe in and are excited by our differentiated approach. They realise that working on our clients’ most important projects is a privilege. While the problems we solve are strategic to clients, consultants recognise that solutions primarily require hard work – rigorous and thorough analysis, partnering with client team members to overcome political and emotional obstacles, and a large investment in knowledge development and self-growth.

Get In Touch

16th Floor, The Forum, 2 Maude Street, Sandton, Johannesburg, South Africa
+27 11 461 6371

Global Advisors | Quantified Strategy Consulting