June 12, 2023
Move beyond AI hype to real-world AI initiatives, using time-tested software development and data management best practices.
Generative AI, which is almost synonymous with ChatGPT these days, continues to make headlines around the world. “What are we doing with ChatGPT?” is a common question. The Wall Street Journal reports venture capitalists are pouring money into AI startups piggybacking on large language models, even when they lack clear business plans. Big players in the AI space all want to be seen as leading in the generative AI game.
What gets lost in all this noise is that proven open-source artificial intelligence/machine learning (AI/ML) models exist right now. These are not mysterious black boxes but are well-documented models. Companies can build successful solutions on these when they follow well-known software development and data engineering best practices.
That may not sound as exciting as a conversation with Cleopatra via ChatGPT. But the proven rigor of these practices is what will enable companies to make faster, more predictive and proactive decisions by applying business discipline to their AI initiatives. In addition to setting realistic budgets and timelines, companies will work with transparent models and be able to reuse components to build on and extend their initial AI efforts for even greater utility and returns.
In our experience, only about 10% of corporate AI projects actually get deployed. And while 68% of businesses, globally and across industries, have adopted AI/ML, according to our recent research, many are struggling to scale their AI initiatives and realize business value from these projects. In our study, just 39% of respondents said AI/ML had contributed significant business value. Such limited corporate use reflects a tendency to treat AI as an experimental technology to play with rather than to apply a structured, disciplined approach.
A disciplined approach uses proven data and software engineering frameworks as the foundation for training open-source AI/ML models. Because the frameworks are well established, we know their associated timelines and costs. The business applications of different open-source AI models are clear, such as whether they are better suited for finding and predicting patterns in images or in text. By building a well-structured and trained AI model, businesses can also generate desired results faster. Further, they can apply the model to other situations by training it with different data.
One of our clients, an aquaculture major in Norway, wanted a faster and more accurate way of understanding fish development. The company was curious about using computer vision to track growth and detect diseases and malformations. We helped the client train a convolutional neural network to identify salmon by weight and length. This type of open-source model excels at categorizing images, essentially “encoding” them in its internal connections.
With proper design, such as that enabled by our Learning Evolutionary Algorithm Framework, the model will be able to recognize additional patterns. Now, when the client wants to identify additional fish, it does not need to build a new model. Instead, it can use its existing model “off the shelf,” training it with different data sets about other fish species.
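The reuse pattern described above — keep the trained "encoding" layers and retrain only the task-specific part on a new data set — can be sketched in a few lines. Everything here is a hypothetical illustration: the frozen random-projection encoder stands in for a trained convolutional network, and the fish data and class names are invented, not the client's actual system.

```python
import numpy as np

class ReusableModel:
    """A frozen feature encoder plus a retrainable classification head.

    The encoder is a fixed random projection standing in for the trained
    convolutional layers; only the head (one centroid per class) is refit
    when the model is pointed at a new data set.
    """

    def __init__(self, n_features, n_encoded, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_features, n_encoded))  # frozen weights
        self.centroids = {}  # label -> mean encoded vector (the "head")

    def encode(self, X):
        return np.tanh(X @ self.W)  # frozen feature extraction

    def fit_head(self, X, y):
        """Retrain only the head: one centroid per class label."""
        Z = self.encode(X)
        self.centroids = {label: Z[y == label].mean(axis=0)
                          for label in np.unique(y)}

    def predict(self, X):
        Z = self.encode(X)
        labels = list(self.centroids)
        dists = np.stack([np.linalg.norm(Z - self.centroids[l], axis=1)
                          for l in labels])
        return np.array(labels)[dists.argmin(axis=0)]

# Train the head for one task (hypothetical salmon size classes) ...
rng = np.random.default_rng(1)
X_salmon = rng.normal(size=(40, 8)) + np.repeat([[0.0], [2.0]], 20, axis=0)
y_salmon = np.array(["small"] * 20 + ["large"] * 20)
model = ReusableModel(n_features=8, n_encoded=16)
model.fit_head(X_salmon, y_salmon)

# ... then reuse the same frozen encoder for a different species,
# retraining only the head with the new data set.
X_trout = rng.normal(size=(40, 8)) + np.repeat([[0.0], [2.0]], 20, axis=0)
y_trout = np.array(["juvenile"] * 20 + ["adult"] * 20)
model.fit_head(X_trout, y_trout)
```

The design choice mirrors the article's point: the expensive part (the encoder) is built once, while adapting to a new species touches only a small, cheap-to-retrain component.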
Data engineering tasks are the foundation of robust AI/ML solutions built on open resources. Once an organization develops an AI/ML model, the model can be used off the shelf and retrained with new data sets. Companies can create their own repository of off-the-shelf AI/ML models specific to their unique needs.
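A company-specific repository of off-the-shelf models can be as simple as a registry that stores each trained artifact alongside the metadata needed to reuse it. This is a minimal sketch; the entry fields and the registered model name are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ModelEntry:
    """One off-the-shelf model plus the metadata needed to reuse it."""
    model: Any                    # the trained artifact
    task: str                     # e.g. "image classification"
    retrain: Callable[..., Any]   # how to refit it on a new data set
    tags: list = field(default_factory=list)

class ModelRepository:
    """A minimal in-house repository of reusable AI/ML models."""

    def __init__(self):
        self._entries = {}

    def register(self, name, entry: ModelEntry):
        self._entries[name] = entry

    def get(self, name) -> ModelEntry:
        return self._entries[name]

    def find_by_task(self, task):
        return [n for n, e in self._entries.items() if e.task == task]

# Hypothetical usage: register a trained fish classifier, look it up later.
repo = ModelRepository()
repo.register("fish-size-cnn", ModelEntry(
    model=object(),               # stand-in for a trained network
    retrain=lambda data: data,    # stand-in retraining hook
    task="image classification",
    tags=["aquaculture", "computer-vision"],
))
```

Storing a `retrain` hook with each entry is what makes the repository extensible in the article's sense: pulling a model off the shelf and pointing it at a new data set becomes a lookup plus one call.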
The following steps can help companies create an internal repository of off-the-shelf AI/ML models that are both reusable and extensible:
Organizations that follow these steps and apply the widely known lessons from software development and data engineering will see more use and value from AI/ML more quickly than those that either dabble with AI technologies or leap into applying large open language models like ChatGPT.
Our view is that it is entirely possible to create an AI/ML learning service catalog organized by the type of prediction to be made (i.e., image recognition and classification, text recognition and classification, time series prediction, structured data prediction) and its relative complexity (driven largely by data engineering).
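Such a catalog can be expressed as a small data structure mapping each prediction category to a model family and a relative complexity. The model families and complexity ratings below are illustrative assumptions, not benchmarks from the article.

```python
# A sketch of an AI/ML learning service catalog: each entry pairs a
# prediction category with a typical open-source model family and a
# relative complexity driven mainly by data engineering effort.
SERVICE_CATALOG = {
    "image recognition and classification": {
        "model_family": "convolutional neural networks",
        "relative_complexity": "high",   # labeling and image pipelines
    },
    "text recognition and classification": {
        "model_family": "transformer / language models",
        "relative_complexity": "medium",
    },
    "time series prediction": {
        "model_family": "recurrent networks / gradient boosting",
        "relative_complexity": "medium",
    },
    "structured data prediction": {
        "model_family": "gradient-boosted trees",
        "relative_complexity": "low",    # tabular data, simpler pipelines
    },
}

def estimate_effort(category: str) -> str:
    """Look up the relative complexity for a requested prediction type."""
    return SERVICE_CATALOG[category]["relative_complexity"]
```

Keeping the catalog as data rather than prose is what lets effort and budget estimates (the next paragraph's point) be produced consistently for each new request.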
These factors, coupled with a structured approach to industrialization (proof of value, MVP and production), provide a good idea of the effort and budget needed. Also of critical importance is a solid MLOps architecture that allows continuous integration/continuous delivery (CI/CD) and the streaming of models into production.
Deployment of AI/ML solutions is, despite appearances, very similar to other software engineering disciplines. Following a structured approach, within a well-defined MLOps architecture owned by a dedicated center of excellence, forms the basis for reusable data products that can be developed with a service-catalog mindset.
The call to action is clear: a structured approach is what makes AI/ML initiatives scalable and ready for industrialization.