ARTIFICIAL INTELLIGENCE
An AI-native strategy firm
Global Advisors: a consulting leader in defining quantified strategy, decreasing uncertainty, improving decisions and achieving measurable results.
A Different Kind of Partner in an AI World
AI-native strategy
consulting
Experienced hires
We are hiring experienced top-tier strategy consultants
Quantified Strategy
Decreased uncertainty, improved decisions
Global Advisors is a leader in defining quantified strategies, decreasing uncertainty, improving decisions and achieving measurable results.
We specialise in providing highly-analytical data-driven recommendations in the face of significant uncertainty.
We utilise advanced predictive analytics to build robust strategies and enable our clients to make calculated decisions.
We support implementation of adaptive capability and capacity.
Selected News
Term: Gradient descent
“Gradient descent is a core optimization algorithm in artificial intelligence (AI) and machine learning used to find the optimal parameters for a model by minimizing a cost (or loss) function.” – Gradient descent
Gradient descent is a first-order iterative optimisation algorithm used to minimise a differentiable cost or loss function by adjusting model parameters in the direction of the steepest descent.4,1 It is fundamental in artificial intelligence (AI) and machine learning for training models such as linear regression, neural networks, and logistic regression by finding optimal parameters that reduce prediction errors.2,3
How Gradient Descent Works
The algorithm starts from an initial set of parameters and iteratively updates them using the formula:
$$\theta_{new} = \theta_{old} - \alpha \nabla J(\theta)$$
where $\theta$ represents the parameters, $\alpha$ is the learning rate (step size), and $\nabla J(\theta)$ is the gradient of the cost function $J$.4,6 The negative gradient points in the direction of fastest decrease, analogous to descending a valley by following the steepest downhill path.1,2
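The update rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `gradient_descent`, the toy cost J(theta) = (theta - 3)^2 and the chosen learning rate and step count are all assumptions made for the example.

```python
def gradient_descent(grad, theta0, learning_rate=0.1, steps=100):
    """Repeatedly apply theta_new = theta_old - learning_rate * grad(theta_old)."""
    theta = theta0
    for _ in range(steps):
        theta = theta - learning_rate * grad(theta)
    return theta


# Hypothetical cost J(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
def grad_J(theta):
    return 2.0 * (theta - 3.0)


theta_min = gradient_descent(grad_J, theta0=0.0)
print(theta_min)  # approaches 3.0, the minimiser of this toy cost
```

Each step moves against the gradient, so the distance to the minimiser shrinks by a constant factor per iteration for this particular quadratic.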
Key Components
- Learning Rate ($\alpha$): Controls step size. Too small a value leads to slow convergence; too large a value may overshoot the minimum.1,2
- Cost Function: Measures model error, e.g., mean squared error (MSE) for regression.3
- Gradient: Partial derivatives indicating how to adjust each parameter.4
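The three components above can be seen working together in a small sketch. It assumes a one-variable linear model y = w*x + b, an MSE cost and four made-up data points; the helper names `mse`, `mse_gradient` and `train` are hypothetical and chosen only for this illustration.

```python
# Toy data generated from y = 2x + 1 (illustrative values, not from the article).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def mse(w, b):
    """Cost function: mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w, b):
    """Gradient: partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def train(learning_rate, steps=100):
    """Run gradient descent from w = b = 0 and return (w, b, final cost)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradient(w, b)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b, mse(w, b)


# Compare a small, a moderate and a large learning rate on the same problem.
for lr in (0.001, 0.05, 0.5):
    print(lr, train(lr))
```

On this toy problem the smallest rate is still far from the minimum after 100 steps, the moderate rate gets close to w = 2, b = 1, and the largest rate overshoots on every step and the cost grows instead of shrinking. The specific thresholds depend on the data and are not general recommendations.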
Types of Gradient Descent
| Type | Description | Advantages |
|---|---|---|
| Batch Gradient Descent | Uses entire dataset per update. | Stable convergence.5 |
| Stochastic Gradient Descent (SGD) | Updates per single example. | Cheaper updates on large datasets; gradient noise can help escape shallow local minima.3 |
| Mini-Batch Gradient Descent | Uses small batches. | Balances speed and stability; most common in practice.5 |
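The three variants in the table differ only in how many examples feed each update, which a single batch-size parameter can express. The sketch below assumes a linear model fitted by MSE on noise-free toy data; the function name `minibatch_gradient_descent` and all hyperparameter values are illustrative assumptions.

```python
import random


def minibatch_gradient_descent(xs, ys, batch_size, learning_rate=0.1,
                               epochs=200, seed=0):
    """Fit y = w*x + b by MSE; batch_size selects the variant."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average the gradient over the examples in this batch only.
            dw = sum(2.0 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            db = sum(2.0 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * dw
            b -= learning_rate * db
    return w, b


# Noise-free toy data from y = 2x + 1 (illustrative values only).
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys = [2.0 * x + 1.0 for x in xs]

n = len(xs)
print("batch:     ", minibatch_gradient_descent(xs, ys, batch_size=n))
print("stochastic:", minibatch_gradient_descent(xs, ys, batch_size=1))
print("mini-batch:", minibatch_gradient_descent(xs, ys, batch_size=2))
```

Because the toy data contain no noise, all three variants settle near w = 2, b = 1. With noisy data, the stochastic and mini-batch variants would typically fluctuate around the minimum unless the learning rate is reduced over time.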
Challenges and Solutions
- Local Minima: The algorithm may become trapped at suboptimal points; the gradient noise in SGD can help it escape.2
- Slow Convergence: Addressed by momentum or adaptive rates like Adam.2
- Learning Rate Sensitivity: Techniques include scheduling or RMSprop.2
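Two of the remedies named above, momentum and Adam, can be sketched against plain gradient descent. The example assumes a hypothetical ill-conditioned cost J(x, y) = 0.5 * (x^2 + 25 y^2); the function names and hyperparameters are illustrative, and the Adam update follows the commonly published form with bias correction rather than any particular library's implementation.

```python
import math


def grad(theta):
    """Gradient of the hypothetical cost J(x, y) = 0.5 * (x**2 + 25 * y**2)."""
    x, y = theta
    return [x, 25.0 * y]


def cost(theta):
    x, y = theta
    return 0.5 * (x ** 2 + 25.0 * y ** 2)


def plain_gd(theta, learning_rate=0.03, steps=300):
    theta = list(theta)
    for _ in range(steps):
        g = grad(theta)
        theta = [p - learning_rate * gi for p, gi in zip(theta, g)]
    return theta


def momentum_gd(theta, learning_rate=0.03, beta=0.9, steps=300):
    theta = list(theta)
    velocity = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta)
        # The velocity accumulates a decaying sum of past gradient steps.
        velocity = [beta * v - learning_rate * gi for v, gi in zip(velocity, g)]
        theta = [p + v for p, v in zip(theta, velocity)]
    return theta


def adam(theta, learning_rate=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=300):
    theta = list(theta)
    m = [0.0] * len(theta)  # running mean of gradients
    v = [0.0] * len(theta)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(theta)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = [mi / (1 - beta1 ** t) for mi in m]
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        # Each parameter gets its own effective step size.
        theta = [p - learning_rate * mh / (math.sqrt(vh) + eps)
                 for p, mh, vh in zip(theta, m_hat, v_hat)]
    return theta


start = [1.0, 1.0]
for name, optimiser in (("plain", plain_gd), ("momentum", momentum_gd), ("adam", adam)):
    print(name, cost(optimiser(start)))
```

On this toy cost, plain gradient descent is held back by the shallow x direction, and momentum reaches a lower cost in the same number of steps. Adam rescales each coordinate by its gradient history, so it is less sensitive to the difference in curvature between x and y; how the three compare in general depends on the problem and the hyperparameters.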
Key Theorist: Augustin-Louis Cauchy
Augustin-Louis Cauchy (1789-1857) is generally credited with originating the gradient descent method: in 1847 he proposed minimising a function by taking iterative steps proportional to the negative gradient.4 His work laid the foundation for modern optimisation in AI.
Biography
Born in Paris at the outbreak of the French Revolution, Cauchy showed prodigious talent, entering the École Centrale du Panthéon in 1802 and the École Polytechnique in 1805. He contributed profoundly to analysis, introducing rigorous definitions of limits, convergence, and complex functions. Despite spending years in self-imposed exile after the July Revolution of 1830, he produced roughly 800 papers, influencing fields from elasticity to optics. Cauchy served as a professor at the École Polytechnique and the Sorbonne, though his ultramontane Catholic views led to professional conflicts.4
Relationship to Gradient Descent
In his 1847 memoir “Méthode générale pour la résolution des systèmes d’équations simultanées,” Cauchy described an iterative process equivalent to gradient descent: updating variables by subtracting a positive multiple of the partial derivatives. This predates the method's widespread use in machine learning by over a century; there, it is combined with backpropagation, which computes the gradients used to train neural networks. Cauchy’s original formulation used the full gradient, with no notion of batching, but its core principle remains unchanged.4
Legacy
Cauchy’s method and its variants made the training of deep learning models scalable, helping to move AI from theory to practice. Modern optimisers such as Adam build on the same foundational update rule.2,4
References
1. https://www.geeksforgeeks.org/data-science/what-is-gradient-descent/
2. https://www.datacamp.com/tutorial/tutorial-gradient-descent
3. https://www.geeksforgeeks.org/machine-learning/gradient-descent-algorithm-and-its-variants/
4. https://en.wikipedia.org/wiki/Gradient_descent
5. https://builtin.com/data-science/gradient-descent
7. https://www.ibm.com/think/topics/gradient-descent
8. https://www.youtube.com/watch?v=i62czvwDlsw

Services
Global Advisors is different
We help clients to measurably improve strategic decision-making and the results they achieve through defining clearly prioritised choices, reducing uncertainty, winning hearts and minds and partnering to deliver.
Our difference is embodied in our team. Our values define us.
Corporate portfolio strategy
Define optimal business portfolios aligned with investor expectations
Business unit strategy
Define how to win against competitors
Reach full potential
Understand your business’ core, reach full potential and grow into optimal adjacencies
Deal advisory
M&A, due diligence, deal structuring, balance sheet optimisation
Global Advisors Digital Data Analytics
14 years of quantitative and data science experience
An enabler to delivering quantified strategy and accelerated implementation
Digital enablement, acceleration and data science
Leading-edge data science and digital skills
Experts in large data processing, analytics and data visualisation
Developers of digital proof-of-concepts
An accelerator for Global Advisors and our clients
Join Global Advisors
We hire and grow amazing people
Consultants join our firm based on a fit with our values, culture and vision. They believe in and are excited by our differentiated approach. They realise that working on our clients’ most important projects is a privilege. While the problems we solve are strategic to clients, consultants recognise that solutions primarily require hard work – rigorous and thorough analysis, partnering with client team members to overcome political and emotional obstacles, and a large investment in knowledge development and self-growth.
Get In Touch
16th Floor, The Forum, 2 Maude Street, Sandton, Johannesburg, South Africa
+27114616371
