ARTIFICIAL INTELLIGENCE
An AI-native strategy firm
Global Advisors: a consulting leader in defining quantified strategy, decreasing uncertainty, improving decisions and achieving measurable results.
A Different Kind of Partner in an AI World
AI-native strategy consulting
Experienced hires
We are hiring experienced top-tier strategy consultants
Quantified Strategy
Decreased uncertainty, improved decisions
Global Advisors is a leader in defining quantified strategies, decreasing uncertainty, improving decisions and achieving measurable results.
We specialise in providing highly analytical, data-driven recommendations in the face of significant uncertainty.
We utilise advanced predictive analytics to build robust strategies and enable our clients to make calculated decisions.
We support implementation of adaptive capability and capacity.
Our latest
Thoughts
Global Advisors’ Thoughts: Is insecurity behind that dysfunction?
By Marc Wilson
Marc is a partner at Global Advisors and based in Johannesburg, South Africa
Download this article at http://www.globaladvisors.biz/inc-feed/20170907/thoughts-is-insecurity-behind-that-dysfunction
We tend to characterise insecurity as what we see in overtly fragile, shy and awkward people. We think that their insecurity presents as lack of confidence. And often we associate it with under-achievement.
Sometimes we might be aware that insecurities can lie behind the -ias, -isms and the phobias. Body dysmorphia? Insecurity about attractiveness. Racism? Often the need to find security by claiming superiority, belonging to a group with power, a group you understand and whose acceptance you want. Homophobia? Often insecurity about one’s own sexuality or masculinity / femininity.
So it is counter-intuitive when we discover that behind incredible success often lies – insecurity! In fact, an article I once read described the successful elite of strategy consulting firms as typically “insecure over-achievers.”
Insecurity must be one of the most misunderstood drivers of dysfunction. Instead we see its related symptoms and react to those. “That woman is so overbearing. That guy is so aggressive! That girl is so self-absorbed. That guy is so competitive.” Even, “That guy is so arrogant.”
How is it that someone we might perceive as competitive, arrogant or overconfident might be insecure? Sometimes people overcompensate to hide a weakness or insecurity. Sometimes in an effort to avoid feeling defensive of a perceived shortcoming, they might go on the offensive – telling people they are the opposite or even faking security.
Do we even know what insecurity is? The very need to…
Read the rest of “Power, Control and Space” at http://www.globaladvisors.biz/inc-feed/20170907/thoughts-is-insecurity-behind-that-dysfunction
Strategy Tools
Your due diligence is most likely wrong
As many as 70 – 90% of deals fail to create value for acquirers. The majority of these deals were the subject of commercial or strategic due diligences (DDs). Many DDs are rubber stamps – designed to motivate an investment to shareholders. Yet the requirements for a value-adding DD go beyond this.
Strategic due diligence must test investees against uncertainty via a variety of methods that include scenarios, probabilised forecasts and stress tests to ensure that investees are value accretive.
Firms that invest during downturns outperform those that don’t. DDs undertaken during downturns have a particularly difficult task – how to assess the future prospects of an investee when the future is so uncertain?
There is clearly an integrated approach to successful due diligence – despite the challenges posed by uncertainty.
Fast Facts
The use of full absorption or average costing in asset-intensive industries with under-utilisation can lead to self-defeating pricing strategies
- The use of full absorption or average costing in a manufacturing environment with under-utilisation can lead to self-defeating pricing strategies
- The increase in price to cover costs results in volume decreases – lowering factory utilisation and increasing unit production costs. This is the start of the utilisation-pricing “death spiral”
- Costing according to factory utilisation – partial absorption costing – offers the opportunity to be more strategic about costing and utilisation
- “Unabsorbed” costs can be targeted through OEE and volume improvements. At the same time, the “disadvantage” of having a large factory is normalised and pricing can compete with more fully-utilised factories
- A recent manufacturing client saw 60% of unit costs arise from factory under-utilisation – sub-optimal OEE levels (non-conformance), low volumes and work-centre bottlenecks contributed to the utilisation gap
- These principles can apply to any asset-intensive business – for example banking
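The utilisation-pricing “death spiral” described above can be sketched in a few lines. This is a minimal illustration, not client data: the fixed cost, capacity, variable cost, mark-up, starting utilisation and the constant-elasticity demand response are all hypothetical assumptions chosen to make the mechanism visible.

```python
# Illustrative only: every figure below is a hypothetical assumption.
FIXED_COST = 1_000_000.0   # annual factory fixed cost
CAPACITY = 100_000.0       # units produced at full utilisation
VARIABLE_COST = 10.0       # variable cost per unit
MARGIN = 0.20              # mark-up applied to unit cost
ELASTICITY = -1.5          # assumed constant price elasticity of demand


def unit_cost(volume, full_absorption):
    """Spread fixed cost over actual volume (full absorption)
    or over factory capacity (partial absorption)."""
    base = volume if full_absorption else CAPACITY
    return VARIABLE_COST + FIXED_COST / base


def simulate(start_price, start_volume, rounds, full_absorption):
    """Reprice each round from the latest volume; return (price, volume) per round."""
    price, volume = start_price, start_volume
    history = [(price, volume)]
    for _ in range(rounds):
        new_price = (1 + MARGIN) * unit_cost(volume, full_absorption)
        # Assumed demand response: volume reacts to the relative price change.
        volume *= (new_price / price) ** ELASTICITY
        price = new_price
        history.append((price, volume))
    return history


# The market price was set when the factory was full; volume has since dropped to 60%.
start_price = (1 + MARGIN) * unit_cost(CAPACITY, full_absorption=True)
spiral = simulate(start_price, 60_000.0, 4, full_absorption=True)
stable = simulate(start_price, 60_000.0, 4, full_absorption=False)

# Fixed cost not recovered through pricing under partial absorption:
# the amount to target through OEE and volume improvements.
unabsorbed = FIXED_COST * (1 - 60_000.0 / CAPACITY)

for label, history in (("full absorption", spiral), ("partial absorption", stable)):
    print(label)
    for price, volume in history:
        print(f"  price {price:8.2f}   volume {volume:10.0f}")
print(f"unabsorbed fixed cost: {unabsorbed:,.0f}")
```

Under these assumptions, full absorption raises the price every round as volume falls, while costing at capacity holds the price steady and isolates the unabsorbed cost as a separate improvement target.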
Selected News
Term: Mixture of Experts (MoE)
“Mixture of Experts (MoE) is an efficient neural network architecture that uses multiple specialised sub-models (experts) and a gating network (router) to dynamically select and activate only the most relevant experts for a given input.” – Mixture of Experts (MoE)
This architectural approach divides a large artificial intelligence model into separate sub-networks, each specialising in processing specific types of input data. Rather than activating the entire network for every task, MoE models employ a gating mechanism, often called a router, that intelligently selects which experts should process each input. This selective activation introduces sparsity into the network, meaning only a fraction of the model’s total parameters are used for any given computation.1,3
Core Architecture and Components
The fundamental structure of MoE consists of two essential elements:4
- Expert networks: Multiple specialised sub-networks, typically implemented as feed-forward neural networks (FFNs), each with its own set of learnable parameters. These experts become skilled at handling specific patterns or types of data during training.1
- Gating network (router): A trainable mechanism that evaluates each input and determines which expert or combination of experts is best suited to process it. This routing function is computationally efficient, enabling the model to make rapid decisions about expert selection.1,3
In practical implementations, such as the Mixtral 8x7B language model, each layer contains multiple experts: in Mixtral’s case, eight separate feed-forward blocks. The “8x7B” name is loose shorthand rather than a per-expert size, because the attention layers are shared across experts and the model totals roughly 47 billion parameters, not 56 billion. For every token processed, the router selects only a subset of these experts (in Mixtral’s case, two out of eight) to perform the computation, then combines their outputs before passing the result to the next layer.3
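The routing step described above can be sketched for a single token as follows. This is a toy illustration in plain Python, not any model’s actual implementation: the dimensions, random weights, the single-layer ReLU “experts” and the helper names (`moe_layer`, `make_expert`, `matvec`) are all hypothetical, and renormalising the gate weights with a softmax over only the selected experts is one common choice assumed here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_layer(token, router_weights, experts, top_k=2):
    """Route one token to its top_k experts and mix their outputs."""
    logits = matvec(router_weights, token)  # one router score per expert
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([logits[i] for i in chosen])  # weights over chosen experts only
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](token)  # only the chosen experts are evaluated
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


def make_expert(rng, dim):
    """Toy stand-in for a feed-forward expert: one linear map followed by ReLU."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    return lambda x: [max(0.0, v) for v in matvec(weights, x)]


rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 8
experts = [make_expert(rng, DIM) for _ in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_layer(token, router_weights, experts)
print("experts used:", chosen)
print("layer output:", output)
```

Six of the eight toy experts are never called for this token, which is the sparsity the text refers to.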
How MoE Achieves Efficiency
MoE models leverage conditional computation to reduce computational burden without sacrificing model capacity.3 This approach enables several efficiency gains:
- Models can scale to billions of parameters whilst maintaining manageable inference costs, since not all parameters are activated for every input.1,3
- Training can occur with significantly less compute, allowing researchers to either reduce training time or expand model and dataset sizes.4
- Experts can be distributed across multiple devices through expert parallelism, enabling efficient large-scale deployments.1
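The gap between total and active parameters behind these gains is simple arithmetic. The sketch below uses a hypothetical configuration (the layer count, expert size and shared-parameter figure are illustrative assumptions, not the specification of any real model) and a hypothetical helper, `moe_parameter_counts`.

```python
def moe_parameter_counts(shared_params, params_per_expert, num_experts,
                         experts_per_token, num_layers):
    """Return (total, active-per-token) parameter counts for a simple MoE stack.

    shared_params covers everything outside the experts (for example attention
    and embeddings), which every token uses regardless of routing.
    """
    total = shared_params + num_layers * num_experts * params_per_expert
    active = shared_params + num_layers * experts_per_token * params_per_expert
    return total, active


# Hypothetical configuration, chosen only to illustrate the ratio.
total, active = moe_parameter_counts(
    shared_params=1_000_000_000,
    params_per_expert=100_000_000,
    num_experts=8,
    experts_per_token=2,
    num_layers=24,
)
print(f"total parameters:  {total:,}")
print(f"active per token:  {active:,}")
print(f"active fraction:   {active / total:.1%}")
```

Because the shared parameters are always active, the active fraction is higher than the simple ratio of selected experts to total experts.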
The gating mechanism ensures that frequently selected experts receive continuous updates during training, improving their performance, whilst load balancing mechanisms attempt to distribute computational work evenly across experts to prevent bottlenecks.1
Historical Development and Key Theorist: Noam Shazeer
Noam Shazeer stands as the primary architect of modern MoE systems in deep learning. In 2017, Shazeer and colleagues, including the legendary Geoffrey Hinton and Google’s Jeff Dean, introduced the Sparsely-Gated Mixture-of-Experts Layer for recurrent neural language models.1,4 This seminal work fundamentally transformed how researchers approached scaling neural networks.
Shazeer’s contribution was revolutionary because it reintroduced the mixture of experts concept, which had existed in earlier machine learning literature, into the deep learning era. His team scaled this architecture to a 137-billion-parameter LSTM model, demonstrating that sparsity could maintain very fast inference even at massive scale.4 Although this initial work focused on machine translation and encountered challenges such as high communication costs and training instabilities, it established the theoretical and practical foundation for all subsequent MoE research.4
Shazeer’s background as a researcher at Google positioned him at the intersection of theoretical machine learning and practical systems engineering. His work exemplified a crucial insight: that not all parameters in a neural network need to be active simultaneously. This principle has since become foundational to modern large language model design, influencing architectures used by leading AI organisations worldwide. The Sparsely-Gated Mixture-of-Experts Layer introduced the trainable gating network concept that remains central to MoE implementations today, enabling conditional computation that balances model expressiveness with computational efficiency.1
Applications and Performance
MoE architectures have demonstrated faster training and comparable or superior performance to dense language models on many benchmarks, particularly in multi-domain tasks where different experts can specialise in different knowledge areas.1 Applications span natural language processing, computer vision, and recommendation systems.2
Challenges and Considerations
Despite their advantages, MoE systems present implementation challenges. Load balancing remains critical: when experts are distributed across multiple devices, uneven expert selection can create memory and computational bottlenecks, with some experts handling significantly more tokens than others.1 Additionally, distributed training complexity and the need for careful tuning to maintain stability and efficiency require sophisticated engineering approaches.1
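One way to quantify the imbalance described above is sketched below. The text does not say which balancing mechanism is meant, so this assumes one widely used formulation: an auxiliary loss equal to the number of experts times the sum, over experts, of the fraction of tokens routed to each expert multiplied by its mean router probability (scaling coefficient omitted). The function name, top-1 routing and sample data are hypothetical.

```python
from collections import Counter


def load_balance_stats(assignments, router_probs, num_experts):
    """Measure how evenly tokens are spread across experts.

    assignments:  the expert index chosen for each token (top-1 for simplicity)
    router_probs: for each token, the router's probabilities over all experts
    Returns (fraction of tokens per expert, auxiliary balancing loss).
    """
    n_tokens = len(assignments)
    counts = Counter(assignments)
    token_fraction = [counts.get(i, 0) / n_tokens for i in range(num_experts)]
    mean_prob = [sum(p[i] for p in router_probs) / n_tokens
                 for i in range(num_experts)]
    # Assumed formulation: smallest when both routing and probabilities are uniform.
    aux_loss = num_experts * sum(f * p for f, p in zip(token_fraction, mean_prob))
    return token_fraction, aux_loss


# Balanced case: four tokens, each sent to a different expert.
balanced_frac, balanced_loss = load_balance_stats(
    assignments=[0, 1, 2, 3],
    router_probs=[[0.25, 0.25, 0.25, 0.25]] * 4,
    num_experts=4,
)

# Skewed case: every token is sent to expert 0.
skewed_frac, skewed_loss = load_balance_stats(
    assignments=[0, 0, 0, 0],
    router_probs=[[0.7, 0.1, 0.1, 0.1]] * 4,
    num_experts=4,
)

print("balanced:", balanced_frac, balanced_loss)
print("skewed:  ", skewed_frac, skewed_loss)
```

Adding a term like this to the training objective penalises routers that concentrate tokens on a few experts; the per-expert token fractions alone are also a useful monitoring metric.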
References
1. https://neptune.ai/blog/mixture-of-experts-llms
2. https://www.datacamp.com/blog/mixture-of-experts-moe
3. https://www.ibm.com/think/topics/mixture-of-experts
4. https://huggingface.co/blog/moe
5. https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts
6. https://www.youtube.com/watch?v=sYDlVVyJYn4
7. https://arxiv.org/html/2503.07137v1
8. https://cameronrwolfe.substack.com/p/moe-llms

Services
Global Advisors is different
We help clients to measurably improve strategic decision-making and the results they achieve through defining clearly prioritised choices, reducing uncertainty, winning hearts and minds and partnering to deliver.
Our difference is embodied in our team. Our values define us.
Corporate portfolio strategy
Define optimal business portfolios aligned with investor expectations
Business unit strategy
Define how to win against competitors
Reach full potential
Understand your business’s core, reach full potential and grow into optimal adjacencies
Deal advisory
M&A, due diligence, deal structuring, balance sheet optimisation
Global Advisors Digital Data Analytics
14 years of quantitative and data science experience
An enabler to delivering quantified strategy and accelerated implementation
Digital enablement, acceleration and data science
Leading-edge data science and digital skills
Experts in large data processing, analytics and data visualisation
Developers of digital proof-of-concepts
An accelerator for Global Advisors and our clients
Join Global Advisors
We hire and grow amazing people
Consultants join our firm based on a fit with our values, culture and vision. They believe in and are excited by our differentiated approach. They realise that working on our clients’ most important projects is a privilege. While the problems we solve are strategic to clients, consultants recognise that solutions primarily require hard work – rigorous and thorough analysis, partnering with client team members to overcome political and emotional obstacles, and a large investment in knowledge development and self-growth.
Get In Touch
16th Floor, The Forum, 2 Maude Street, Sandton, Johannesburg, South Africa
+27114616371
