Trends – Akingate Consultancy

DeepSeek: China’s gamechanging AI system has big implications for UK tech development

Akingate — Tue, 28 Jan 2025 20:25:14 +0000

DeepSeek sent ripples through the global tech landscape this week as it soared above ChatGPT in Apple’s app store. The meteoric rise has shifted the dynamics of US-China tech competition, shocked global tech stock valuations, and reshaped the future direction of artificial intelligence (AI) development.

Among the industry buzz created by DeepSeek’s rise to prominence, one question looms large: what does this mean for the strategy of the third leading global nation for AI development – the United Kingdom?

The generative AI era was kickstarted by the release of ChatGPT on November 30 2022, when large language models (LLMs) entered mainstream consciousness and began reshaping industries and workflows, while everyday users explored new ways to write, brainstorm, search and code. We are now witnessing the “DeepSeek moment” – a pivotal shift that demonstrates the viability of a more efficient and cost-effective approach for AI development.

DeepSeek isn’t just another AI tool. Unlike ChatGPT and other major LLMs developed by tech giants and AI startups in the USA and Europe, DeepSeek represents a significant evolution in the way AI models are developed and trained.

Most existing approaches rely on large-scale computing power and datasets (used to “train” or improve the AI systems), limiting development to very few extremely wealthy market players. DeepSeek not only demonstrates a significantly cheaper and more efficient way of training AI models, but its open-source “MIT” licence (named after the Massachusetts Institute of Technology, where the licence was developed) also allows users to deploy and develop the tool.

This helps democratise AI, taking up the mantle from US company OpenAI – whose initial mission was “to build artificial general intelligence (AGI) that is safe and benefits all of humanity” – enabling smaller players to enter the space and innovate.

By making cutting-edge AI development accessible and affordable to all, DeepSeek has reshaped the competitive landscape, allowing innovation to flourish beyond the confines of large, resource-rich organisations and countries.

It has also set a new benchmark for efficiency in its approach, by training its model at a fraction of the cost, and matching – even surpassing – the performance of most existing LLMs. By employing innovative algorithms and architectures, it is delivering superior results with significantly lower computational demands and environmental impact.

Why DeepSeek matters

DeepSeek was conceived by a group of quantitative trading experts in China. This unconventional origin holds lessons for the UK and the US.

While the UK – particularly London – has long attracted scientific and technological excellence, many of the highest achieving young graduates have tended to disproportionately opt for careers in finance, something that has come at the expense of innovation in other critical sectors such as AI. Diversifying the pathways for STEM (science, technology, engineering and maths) professionals could yield transformative outcomes.

The UK government’s recent and much-publicised 50-point action plan on AI offers glimpses of progressive intent but also displays a lack of boldness to drive real change. Incremental steps are not sufficient in such a fast-moving environment. The UK needs a new plan that leverages its unique strengths while addressing systemic weaknesses.

Firstly, it’s important to recognise that the UK’s comparative advantage lies in its leading interdisciplinary expertise. World-class universities, thriving fintech, dynamic professional services and creative sectors offer fertile ground for AI applications that extend beyond traditional tech silos. The intersection of AI with finance, law, creative industries, and medicine presents opportunities to lead in some niche but high-impact areas.

The UK’s funding and regulatory frameworks are due for an overhaul. DeepSeek’s development underscores the importance of agile, well-funded ecosystems that can support big, ambitious “moonshot” projects. Current UK funding mechanisms are bureaucratic and fragmented, favouring incremental innovations over radical breakthroughs, sometimes stifling innovation rather than nurturing it. Simplifying grant applications and offering targeted tax incentives for AI startups would represent a healthy start.

Finally, it will be critical for the UK to keep its talent in the country. The UK’s AI sector faces a brain drain as top talent gravitates toward better-funded opportunities in the US and China. Initiatives such as public-private partnerships for AI research development can help anchor talent at home.

DeepSeek’s rise is an excellent example of strategic foresight and execution. It doesn’t merely aim to improve existing models but redefines the boundaries of how AI could be developed and deployed – while demonstrating efficient, cost-effective approaches that can yield astounding results. The UK should adopt a similarly ambitious mindset, focusing on areas where it can set global standards rather than playing catch-up.

AI’s geopolitics cannot be ignored either. As the US and China compete, the UK has a critical role as the trusted intermediary and ethical leader in AI governance. The UK can punch above its weight on the global stage by championing transparent AI standards and fostering international collaboration.

DeepSeek’s success should serve as a wake-up call. Britain has the talent, institutions and entrepreneurial spirit to be a leading player in AI – but it must act decisively, and now.

It is time to remove token gestures, embrace bold strategies that move the needle, and position the UK as a leader in an AI-driven future. This moment calls for action, not just more conversation.

DeepSeek has raised the bar. It is now up to the UK to meet it.

Author: Feng Li, Chair of Information Management, Associate Dean for Research & Innovation, Bayes Business School, City St George’s, University of London

This article is republished from The Conversation under a Creative Commons license.

Image by rawpixel.com on Freepik

Artificial intelligence needs to be trained on culturally diverse datasets to avoid bias

Akingate — Sat, 17 Feb 2024 16:45:14 +0000

Large language models (LLMs), such as OpenAI’s ChatGPT, are deep learning artificial intelligence programs. Their capabilities now span a wide range, from writing fluent essays to coding and creative writing.

Millions of people worldwide use LLMs, and it would not be an exaggeration to say these technologies are transforming work, education and society.

LLMs are trained by reading massive amounts of texts and learning to recognize and mimic patterns in the data. This allows them to generate coherent and human-like text on virtually any topic.
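The idea of learning patterns from text can be illustrated with a toy that is far simpler than a real LLM. The sketch below counts which word follows which in a tiny made-up corpus and then predicts the most frequent follower. Real LLMs use neural networks rather than raw counts, and the corpus, function names and variable names here are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical training text: the model can only echo patterns found here.
corpus = "we tip the waiter . we tip the driver . we thank the waiter"
model = train_bigram_counts(corpus)

print(most_likely_next(model, "we"))      # most frequent word after "we"
print(most_likely_next(model, "madrid"))  # never seen in training, so None
```

Even in this toy, the predictions reflect only what the training text contains, which is the mechanism behind the cultural skew discussed in the rest of this article.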

Because the internet is still predominantly English — 59 per cent of all websites were in English as of January 2023 — LLMs are primarily trained on English text. In addition, the vast majority of the English text online comes from users based in the United States, home to 300 million English speakers.

Learning about the world from English texts written by U.S.-based web users, LLMs speak Standard American English and have a narrow western, North American, or even U.S.-centric, lens.

Model bias

In 2023, ChatGPT, upon learning about a couple dining in a restaurant in Madrid and tipping four per cent, suggested they were frugal, on a tight budget or didn’t like the service. By default, ChatGPT followed the North American standard of a 15 to 25 per cent tip, ignoring the Spanish norm not to tip.

As of early 2024, ChatGPT correctly cites cultural differences when prompted to judge the appropriateness of a tip. It’s unclear if this capability emerged from training a newer version of the model on more data — after all, the web is full of tipping guides in English — or whether OpenAI patched this particular behaviour.

Using data from English-language websites, which are predominantly U.S.-based, informs how LLMs respond to prompts.
(Unsplash/Jonathen Kemper)

Still, other examples remain that uncover ChatGPT’s implicit cultural assumptions. For example, prompted with a story about guests showing up for dinner at 8:30 p.m., it suggested reasons that the guests were late, although the time of the invitation was not mentioned. Again, ChatGPT likely assumed they were invited for a standard North American 6 p.m. dinner.

In May 2023, researchers from the University of Copenhagen quantified this effect by prompting LLMs with the Hofstede Culture Survey, which measures human values in different countries. Shortly after, researchers from AI start-up company Anthropic used the World Values Survey to do the same. Both works concluded that LLMs exhibit strong alignment with American culture.

A similar phenomenon is encountered when asking DALL-E 3, an image generation model trained on pairs of images and their captions, to generate an image of a breakfast. This model, which was trained mainly on images from Western countries, generated images of pancakes, bacon and eggs.

Impacts of bias

Culture plays a significant role in shaping our communication styles and worldviews. Just as cross-cultural human interactions can lead to miscommunications, users from diverse cultures who interact with conversational AI tools may feel misunderstood and experience them as less useful.

To be better understood by AI tools, users may adapt their communication styles in a manner similar to how people learned to “Americanize” their foreign accents in order to operate personal assistants like Siri and Alexa.

As more people rely on LLMs for editing writing, these tools are likely to homogenize how we write. Over time, LLMs run the risk of erasing cultural differences.

Decision-making and AI

AI is already in use as the backbone of various applications that make decisions affecting people’s lives, such as resume filtering, rental applications and social benefits applications.

For years, AI researchers have been warning that these models learn not only “good” statistical associations — such as considering experience as a desired property for a job candidate — but also “bad” statistical associations, such as considering women as less qualified for tech positions.

As LLMs are increasingly used for automating such processes, one can imagine that the North American bias learned by these models can result in discrimination against people from diverse cultures. Lack of cultural awareness may lead to AI perpetuating stereotypes and reinforcing societal inequalities.

LLMs for languages other than English

Developing LLMs for languages other than English is an important effort, and many such models exist. However, there are several reasons why this should be done in parallel to improving LLMs’ cultural awareness and sensitivity.

First, there is a huge population of English speakers outside of North America who are not represented by English LLMs. The same argument holds for other languages. A French language model would be representative of the culture in France more than the culture in other Francophone regions.

Training LLMs for regional dialects — which may capture finer-grained cultural differences — is not a feasible solution either. The quality of LLMs is based on the amount of data available, and as such, their quality would be worse for dialects with little online data.

Second, many users whose native language is not English still choose to use English LLMs. Significant breakthroughs in language technologies tend to start with English before they are applied to other languages. Even then, many languages — such as Welsh, Swahili and Bengali — don’t have enough text online to train high quality models.

Due to either a lack of availability of LLMs in their native languages, or superior quality of the English LLMs, users from diverse countries and backgrounds may prefer to use English LLMs.

Ways forward

Our research group at the University of British Columbia is working on enhancing LLMs with culturally diverse knowledge. Together with graduate student Mehar Bhatia, we trained an AI model on a collection of facts about traditions and concepts in diverse cultures.

Before reading these facts, the AI suggested that a person eating a dutch baby (a type of German pancake) is “disgusting and mean,” and would feel guilty. After training, it said the person feels “full and satisfied.”

Teaching an AI that a dutch baby was a dish changed its response to learning that someone had consumed one.
(Shutterstock)

We are currently collecting a large scale image captioning dataset with images from 60 cultures, which will help models learn, for instance, about types of breakfasts other than bacon and eggs. Our future research will go beyond teaching models about the existence of culturally diverse concepts to better understand how people interpret the world through the lens of their cultures.

With AI tools becoming increasingly ubiquitous in society, it is imperative that they go beyond the dominating western and North American perspectives. Businesses and organizations throughout many sectors of the economy are adopting AI to automate manual processes and make better evidence-informed decisions using data. Making such tools more inclusive is crucial for the diverse population of Canada.

Author: Vered Shwartz, Assistant Professor, Computer science, University of British Columbia

This article is republished from The Conversation under a Creative Commons license.

Image Credits: There is a growing need to address diversity in the datasets used to train artificial intelligence. (Shutterstock)

Watch the Video – Revolutionising Power Generation: Unleashing the Potential of Artificial Intelligence and Machine Learning

Akingate — Mon, 10 Apr 2023 09:16:20 +0000

In recent years, the world has witnessed unprecedented advancements in the field of Artificial Intelligence (AI) and Machine Learning (ML), which have opened up new avenues for innovation across various industries. One such industry that has been transformed by these cutting-edge technologies is power generation. AI and ML have revolutionized the way we generate, distribute, and consume energy, enabling us to achieve greater efficiency, reliability, and sustainability. In this article, we will explore the exciting applications of AI and ML in power generation and how they are shaping the future of the energy industry. From predictive maintenance to demand forecasting, we will delve into the various ways in which these technologies are being leveraged to enhance the performance and resilience of power systems. Join us as we embark on a journey into the fascinating world of AI and ML for power generation.

Watch the video

Image Credit: Image by tawatchai07 on Freepik

AI will soon become impossible for humans to comprehend – the story of neural networks tells us why

Akingate — Fri, 07 Apr 2023 08:37:08 +0000

In 1956, during a year-long trip to London and in his early 20s, the mathematician and theoretical biologist Jack D. Cowan visited Wilfred Taylor and his strange new “learning machine”. On his arrival he was baffled by the “huge bank of apparatus” that confronted him. Cowan could only stand by and watch “the machine doing its thing”. The thing it appeared to be doing was performing an “associative memory scheme” – it seemed to be able to learn how to find connections and retrieve data.

It may have looked like clunky blocks of circuitry, soldered together by hand in a mass of wires and boxes, but what Cowan was witnessing was an early analogue form of a neural network – a precursor to the most advanced artificial intelligence of today, including the much discussed ChatGPT with its ability to generate written content in response to almost any command. ChatGPT’s underlying technology is a neural network.

As Cowan and Taylor stood and watched the machine work, they really had no idea exactly how it was managing to perform this task. The answer to Taylor’s mystery machine brain can be found somewhere in its “analog neurons”, in the associations made by its machine memory and, most importantly, in the fact that its automated functioning couldn’t really be fully explained. It would take decades for these systems to find their purpose and for that power to be unlocked.

Jack Cowan, who played a key part in the development of neural networks from the 1950s onwards.
University of Chicago Photographic Archive, Hanna Holborn Gray Special Collections Research Center.

The term neural network incorporates a wide range of systems, yet centrally, according to IBM, these “neural networks – also known as artificial neural networks (ANNs) or simulated neural networks (SNNs) – are a subset of machine learning and are at the heart of deep learning algorithms”. Crucially, the term itself and their form and “structure are inspired by the human brain, mimicking the way that biological neurons signal to one another”.

There may have been some residual doubt about their value in their initial stages, but as the years have passed AI fashions have swung firmly towards neural networks. They are now often understood to be the future of AI. They have big implications for us and for what it means to be human. We have heard echoes of these concerns recently with calls to pause new AI developments for a six-month period to ensure confidence in their implications.

It would certainly be a mistake to dismiss the neural network as being solely about glossy, eye-catching new gadgets. They are already well established in our lives. Some are powerful in their practicality. As far back as 1989, a team led by Yann LeCun at AT&T Bell Laboratories used back-propagation techniques to train a system to recognise handwritten postal codes. The recent announcement by Microsoft that Bing searches will be powered by AI, making it your “copilot for the web”, illustrates how the things we discover and how we understand them will increasingly be a product of this type of automation.

Drawing on vast data to find patterns, AI can similarly be trained to do things like image recognition at speed – resulting in its incorporation into facial recognition, for instance. This ability to identify patterns has led to many other applications, such as predicting stock markets.

Neural networks are changing how we interpret and communicate too. Developed by the interestingly titled Google Brain Team, Google Translate is another prominent application of a neural network.

You wouldn’t want to play Chess or Shogi with one either. Their grasp of rules and their recall of strategies and all recorded moves mean that they are exceptionally good at games (although ChatGPT seems to struggle with Wordle). The systems that are troubling human Go players (Go is a notoriously tricky strategy board game) and Chess grandmasters are made from neural networks.

But their reach goes far beyond these instances and continues to expand. A search of patents restricted only to mentions of the exact phrase “neural networks” produces 135,828 results. With this rapid and ongoing expansion, the chances of us being able to fully explain the influence of AI may become ever thinner. These are the questions I have been examining in my research and my new book on algorithmic thinking.

Mysterious layers of ‘unknowability’

Looking back at the history of neural networks tells us something important about the automated decisions that define our present or those that will have a possibly more profound impact in the future. Their presence also tells us that we are likely to understand the decisions and impacts of AI even less over time. These systems are not simply black boxes; they are not just hidden bits of a system that can’t be seen or understood.

It is something different, something rooted in the aims and design of these systems themselves. There is a long-held pursuit of the unexplainable. The more opaque, the more authentic and advanced the system is thought to be. It is not just about the systems becoming more complex or the control of intellectual property limiting access (although these are part of it). It is instead to say that the ethos driving them has a particular and embedded interest in “unknowability”. The mystery is even coded into the very form and discourse of the neural network. They come with deeply piled layers – hence the phrase deep learning – and within those depths are the even more mysterious sounding “hidden layers”. The mysteries of these systems are deep below the surface.

There is a good chance that the greater the impact that artificial intelligence comes to have in our lives the less we will understand how or why. Today there is a strong push for AI that is explainable. We want to know how it works and how it arrives at decisions and outcomes. The EU is so concerned by the potentially “unacceptable risks” and even “dangerous” applications that it is currently advancing a new AI Act intended to set a “global standard” for “the development of secure, trustworthy and ethical artificial intelligence”.

Those new laws will be based on a need for explainability, demanding that “for high-risk AI systems, the requirements of high quality data, documentation and traceability, transparency, human oversight, accuracy and robustness, are strictly necessary to mitigate the risks to fundamental rights and safety posed by AI”. This is not just about things like self-driving cars (although systems that ensure safety fall into the EU’s category of high risk AI), it is also a worry that systems will emerge in the future that will have implications for human rights.

This is part of wider calls for transparency in AI so that its activities can be checked, audited and assessed. Another example would be the Royal Society’s policy briefing on explainable AI in which they point out that “policy debates across the world increasingly see calls for some form of AI explainability, as part of efforts to embed ethical principles into the design and deployment of AI-enabled systems”.

But the story of neural networks tells us that we are likely to get further away from that objective in the future, rather than closer to it.

Inspired by the human brain

These neural networks may be complex systems yet they have some core principles. Inspired by the human brain, they seek to copy or simulate forms of biological and human thinking. In terms of structure and design they are, as IBM also explains, comprised of “node layers, containing an input layer, one or more hidden layers, and an output layer”. Within this, “each node, or artificial neuron, connects to another”. Because they require inputs and information to create outputs they “rely on training data to learn and improve their accuracy over time”. These technical details matter but so too does the wish to model these systems on the complexities of the human brain.
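The layered structure described above can be sketched in a few lines. This is a minimal illustration, not IBM’s or any particular library’s implementation: it assumes every node is connected to every node in the previous layer and uses a sigmoid activation, and the weights, biases and layer sizes are hypothetical numbers picked by hand rather than learned from training data.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each node sums its weighted inputs, adds its bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the data through each layer in turn, from input to output."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
hidden_layer = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output_layer = ([[0.7, -0.5, 0.2]], [0.05])

result = network_forward([1.0, 0.0], [hidden_layer, output_layer])
print(result)  # a single value between 0 and 1
```

In a trained network the weights would be adjusted from data rather than written by hand, which is where the reliance on training data comes in.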

Grasping the ambition behind these systems is vital in understanding what these technical details have come to mean in practice. In a 1993 interview, the neural network scientist Teuvo Kohonen concluded that a “self-organising” system “is my dream”, operating “something like what our nervous system is doing instinctively”. As an example, Kohonen pictured how a “self-organising” system, a system that monitored and managed itself, “could be used as a monitoring panel for any machine … in every airplane, jet plane, or every nuclear power station, or every car”. This, he thought, would mean that in the future “you could see immediately what condition the system is in”.

Early computing often involved a large apparatus of assembled parts.
Aalto University Archives

The overarching objective was to have a system capable of adapting to its surroundings. It would be instant and autonomous, operating in the style of the nervous system. That was the dream, to have systems that could handle themselves without the need for much human intervention. The complexities and unknowns of the brain, the nervous system and the real world would soon come to inform the development and design of neural networks.

‘Something fishy about it’

But jumping back to 1956 and that strange learning machine, it was the hands-on approach that Taylor had taken when building it that immediately caught Cowan’s attention. He had clearly sweated over the assembly of the bits and pieces. Taylor, Cowan observed during an interview on his own part in the story of these systems, “didn’t do it by theory, and he didn’t do it on a computer”. Instead, with tools in hand, he “actually built the hardware”. It was a material thing, a combination of parts, perhaps even a contraption. And it was “all done with analogue circuitry” taking Taylor, Cowan notes, “several years to build it and to play with it”. A case of trial and error.

Understandably Cowan wanted to get to grips with what he was seeing. He tried to get Taylor to explain this learning machine to him. The clarifications didn’t come. Cowan couldn’t get Taylor to describe to him how the thing worked. The analogue neurons remained a mystery. The more surprising problem, Cowan thought, was that Taylor “didn’t really understand himself what was going on”. This wasn’t just a momentary breakdown in communication between the two scientists with different specialisms, it was more than that.

In an interview from the mid-1990s, thinking back to Taylor’s machine, Cowan revealed that “to this day in published papers you can’t quite understand how it works”. This conclusion is suggestive of how the unknown is deeply embedded in neural networks. The unexplainability of these neural systems has been present even from the fundamental and developmental stages dating back nearly seven decades.

This mystery remains today and is to be found within advancing forms of AI. The unfathomability of the functioning of the associations made by Taylor’s machine led Cowan to wonder if there was “something fishy about it”.

Long and tangled roots

Cowan referred back to his brief visit with Taylor when asked about the reception of his own work some years later. Into the 1960s people were, Cowan reflected, “a little slow to see the point of an analogue neural network”. This was despite, Cowan recalls, Taylor’s 1950s work on “associative memory” being based on “analog neurons”. The Nobel Prize-winning neural systems expert, Leon N. Cooper, concluded that developments around the application of the brain model in the 1960s were regarded “as among the deep mysteries”. Because of this uncertainty there remained a scepticism about what a neural network might achieve. But things slowly began to change.

Some 30 years ago the neuroscientist Walter J. Freeman, who was surprised by the “remarkable” range of applications that had been found for neural networks, was already commenting on the fact that he didn’t see them as “a fundamentally new kind of machine”. They were a slow burn, with the technology coming first and then subsequent applications being found for it. This took time. Indeed, to find the roots of neural network technology we might head back even further than Cowan’s visit to Taylor’s mysterious machine.

The neural net scientist James Anderson and the science journalist Edward Rosenfeld have noted that the background to neural networks goes back into the 1940s and some early attempts to, as they describe, “understand the human nervous systems and to build artificial systems that act the way we do, at least a little bit”. And so, in the 1940s, the mysteries of the human nervous system also became the mysteries of computational thinking and artificial intelligence.

Summarising this long story, the computer science writer Larry Hardesty has pointed out that deep learning in the form of neural networks “have been going in and out of fashion for more than 70 years”. More specifically, he adds, these “neural networks were first proposed in 1944 by Warren McCulloch and Walter Pitts, two University of Chicago researchers who moved to MIT in 1952 as founding members of what’s sometimes called the first cognitive science department”.

The inventors of the neural network Walter Pitts and Warren McCulloch pictured here in 1949.
Semantic Scholar

Elsewhere, 1943 is sometimes given as the first year for the technology. Either way, for roughly 70 years accounts suggest that neural networks have moved in and out of vogue, often neglected but then sometimes taking hold and moving into more mainstream applications and debates. The uncertainty persisted. Those early developers frequently describe the importance of their research as being overlooked, until it found its purpose often years and sometimes decades later.

Moving from the 1960s into the late 1970s we can find further stories of the unknown properties of these systems. Even then, after three decades, the neural network was still to find a sense of purpose. David Rumelhart, who had a background in psychology and was a co-author of a set of books published in 1986 that would later drive attention back again towards neural networks, found himself collaborating on the development of neural networks with his colleague Jay McClelland.

As well as being colleagues they had also recently encountered each other at a conference in Minnesota where Rumelhart’s talk on “story understanding” had provoked some discussion among the delegates.

Following that conference McClelland returned with a thought about how to develop a neural network that might combine models to be more interactive. What matters here is Rumelhart’s recollection of the “hours and hours and hours of tinkering on the computer”.

“We sat down and did all this in the computer and built these computer models, and we just didn’t understand them. We didn’t understand why they worked or why they didn’t work or what was critical about them.”

Like Taylor, Rumelhart found himself tinkering with the system. They too created a functioning neural network and, crucially, they also weren’t sure how or why it worked in the way that it did, seemingly learning from data and finding associations.

Mimicking the brain – layer after layer

You may already have noticed that when discussing the origins of neural networks the image of the brain and the complexity this evokes are never far away. The human brain acted as a sort of template for these systems. In the early stages, in particular, the brain – still one of the great unknowns – became a model for how the neural network might function.

The model of the brain became a model for the layering within artificial neural networks.
Shutterstock/CYB3RUSS

So these experimental new systems were modelled on something whose functioning was itself largely unknown. The neurocomputing engineer Carver Mead has spoken revealingly of the conception of a “cognitive iceberg” that he had found particularly appealing. It is only the tip of the iceberg of consciousness of which we are aware and which is visible. The scale and form of the rest remains unknown below the surface.

In 1998, James Anderson, who had been working for some time on neural networks, noted that when it came to research on the brain “our major discovery seems to be an awareness that we really don’t know what is going on”.

In a detailed account in the Financial Times in 2018, technology journalist Richard Waters noted how neural networks “are modelled on a theory about how the human brain operates, passing data through layers of artificial neurons until an identifiable pattern emerges”. This creates a knock-on problem, Waters proposed, as “unlike the logic circuits employed in a traditional software program, there is no way of tracking this process to identify exactly why a computer comes up with a particular answer”. Waters’ conclusion is that these outcomes cannot be unpicked. The application of this type of model of the brain, taking the data through many layers, means that the answer cannot readily be retraced. The multiple layering is a good part of the reason for this.

Hardesty also observed that these systems are “modelled loosely on the human brain”. This brings an eagerness to build in ever more processing complexity in order to try to match up with the brain. The result of this aim is a neural net that “consists of thousands or even millions of simple processing nodes that are densely interconnected”. Data moves through these nodes in only one direction. Hardesty observed that an “individual node might be connected to several nodes in the layer beneath it, from which it receives data, and several nodes in the layer above it, to which it sends data”.
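The one-directional flow Hardesty describes can be illustrated with a minimal sketch. This is not any particular historical system: the layer sizes, the random weights and the tanh function applied at each node are all assumptions chosen for illustration, as are the helper names make_layer and forward.

```python
import math
import random


def make_layer(n_inputs, n_nodes, rng):
    """Build one layer: each node has one weight per incoming connection, plus a bias."""
    return [
        {
            "weights": [rng.uniform(-1, 1) for _ in range(n_inputs)],
            "bias": rng.uniform(-1, 1),
        }
        for _ in range(n_nodes)
    ]


def forward(layers, inputs):
    """Pass data upwards through the layers, in one direction only."""
    activations = list(inputs)
    for layer in layers:
        # Each node multiplies the values from the layer beneath by its weights,
        # adds them together with its bias, and squashes the result with tanh.
        activations = [
            math.tanh(
                sum(w * a for w, a in zip(node["weights"], activations))
                + node["bias"]
            )
            for node in layer
        ]
    return activations


rng = random.Random(0)   # fixed seed so the illustrative weights are repeatable
sizes = [3, 4, 4, 2]     # assumed sizes: 3 inputs, two layers of 4 nodes, 2 outputs
network = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = forward(network, [0.5, -0.2, 0.1])
print(output)
```

Even in a toy network of this size, the two output numbers are the product of dozens of multiplications and additions, which gives a small-scale sense of why retracing an answer through many layers is difficult.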

Models of the human brain were a part of how these neural networks were conceived and designed from the outset. This is particularly interesting when we consider that the brain was itself a mystery at the time (and in many ways still is).

‘Adaptation is the whole game’

Scientists like Mead and Kohonen wanted to create a system that could genuinely adapt to the world in which it found itself. It would respond to its conditions. Mead was clear that the value in neural networks was that they could facilitate this type of adaptation. At the time, and reflecting on this ambition, Mead added that producing adaptation “is the whole game”. This adaptation is needed, he thought, “because of the nature of the real world”, which he concluded is “too variable to do anything absolute”.

This problem needed to be reckoned with especially as, he thought, this was something “the nervous system figured out a long time ago”. Not only were these innovators working with an image of the brain and its unknowns, they were combining this with a vision of the “real world” and the uncertainties, unknowns and variability that this brings. The systems, Mead thought, needed to be able to respond and adapt to circumstances without instruction.

Around the same time in the 1990s, Stephen Grossberg – an expert in cognitive systems working across maths, psychology and biomedical engineering – also argued that adaptation was going to be the important step in the longer term. Grossberg, as he worked away on neural network modelling, thought to himself that it is all “about how biological measurement and control systems are designed to adapt quickly and stably in real time to a rapidly fluctuating world”. As we saw earlier with Kohonen’s “dream” of a “self-organising” system, a notion of the “real world” becomes the context in which response and adaptation are being coded into these systems. How that real world is understood and imagined undoubtedly shapes how these systems are designed to adapt.

Hidden layers

As the layers multiplied, deep learning plumbed new depths. The neural network is trained using training data that, Hardesty explained, “is fed to the bottom layer – the input layer – and it passes through the succeeding layers, getting multiplied and added together in complex ways, until it finally arrives, radically transformed, at the output layer”. The more layers, the greater the transformation and the greater the distance from input to output. The development of Graphics Processing Units (GPUs), in gaming for instance, Hardesty added, “enabled the one-layer networks of the 1960s and the two- to three-layer networks of the 1980s to blossom into the ten, 15, or even 50-layer networks of today”.

Neural networks are getting deeper. Indeed, it’s this adding of layers, according to Hardesty, that is “what the ‘deep’ in ‘deep learning’ refers to”. This matters, he proposes, because “currently, deep learning is responsible for the best-performing systems in almost every area of artificial intelligence research”.

But the mystery gets deeper still. As the layers of neural networks have piled higher, their complexity has grown. This has also led to the growth of what are referred to as “hidden layers” within these depths. The discussion of the optimum number of hidden layers in a neural network is ongoing. The media theorist Beatrice Fazi has written that “because of how a deep neural network operates, relying on hidden neural layers sandwiched between the first layer of neurons (the input layer) and the last layer (the output layer), deep-learning techniques are often opaque or illegible even to the programmers that originally set them up”.
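A rough sense of how the hidden layers pile up can be had by counting what sits between input and output. The sketch below assumes a fully connected network in which every node links to every node in the next layer; the layer widths are invented for illustration, and count_parameters and hidden_layers are hypothetical helper names.

```python
def hidden_layers(layer_sizes):
    """Number of layers sandwiched between the input layer and the output layer."""
    return max(len(layer_sizes) - 2, 0)


def count_parameters(layer_sizes):
    """Count the adjustable numbers (weights and biases) in a fully connected network."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases


# Assumed widths: 10 inputs, hidden layers of 10 nodes each, 2 outputs.
shallow = [10, 10, 2]           # one hidden layer
deep = [10] + [10] * 50 + [2]   # fifty hidden layers

for name, sizes in [("shallow", shallow), ("deep", deep)]:
    print(name, hidden_layers(sizes), "hidden layers,",
          count_parameters(sizes), "adjustable values")
```

Under these assumptions the shallow network has 132 adjustable values and the deep one has 5,522, none of which carries a human-readable meaning on its own.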

As the layers increase (including those hidden layers), these networks become even less explainable, even to those creating them. Making a similar point, the prominent and interdisciplinary new media thinker Katherine Hayles also noted that there are limits to “how much we can know about the system, a result relevant to the ‘hidden layer’ in neural net and deep learning algorithms”.

Pursuing the unexplainable

Taken together, these long developments are part of what the sociologist of technology Taina Bucher has called the “problematic of the unknown”. Expanding his influential research on scientific knowledge into the field of AI, Harry Collins has pointed out that the objective with neural nets is that they may be produced by a human, initially at least, but “once written the program lives its own life, as it were; without huge effort, exactly how the program is working can remain mysterious”. This has echoes of those long-held dreams of a self-organising system.

I’d add to this that the unknown and maybe even the unknowable have been pursued as a fundamental part of these systems from their earliest stages. There is a good chance that the greater the impact that artificial intelligence comes to have in our lives, the less we will understand how or why.

But that doesn’t sit well with many today. We want to know how AI works and how it arrives at the decisions and outcomes that impact us. As developments in AI continue to shape our knowledge and understanding of the world, what we discover, how we are treated, how we learn, consume and interact, this impulse to understand will grow. When it comes to explainable and transparent AI, the story of neural networks tells us that we are likely to get further away from that objective in the future, rather than closer to it.

Author: David Beer, Professor of Sociology, University of York

This article is republished from The Conversation under a Creative Commons license.

___________________________________

Image Credit: Image by kjpargeter on Freepik