Are AI and data really making tech companies stronger?

Machine learning is probably the most important and fundamental trend in technology today. Because machine learning is built on large amounts of data, we often hear that companies that already have a lot of data will get even stronger.

Many companies are currently advancing in the AI field. Most people think of companies such as Google, Apple, Facebook, and Tesla. These companies use their networks effectively to keep collecting more data. This data increases the precision of their algorithms. More accurate algorithms attract more users, and more users generate more data. This creates a virtuous circle in which data keeps piling up.

This data network effect makes the already strong tech companies using AI even stronger. This has led to the belief that most businesses will now be driven by data and AI.

But is the future really so simple?

Does having a lot of data mean stronger tech companies?

You can apply machine learning in various ways. You can use machine learning to do something new and important. And machine learning works better if you have more data. So how strong would it make a large company that already has a lot of data?

How far does the monopoly by the winners go?

It is a natural idea that the virtuous circle of data network effects makes the winners even stronger. More data leads to models with more accuracy, which means better products, more users, and even more data.

Machine learning requires a large amount of data. But this data needs to be specific to the problem you are trying to solve. GE has a large amount of data coming from gas turbines. Google has a lot of search data. Amex has a large amount of data on credit cards and fraudulent transactions.

Obviously, you cannot use gas turbine data to detect fraudulent credit card transactions. Nor can you use web search data to predict which gas turbines will fail!

You can use machine learning for things as varied as detecting fraudulent transactions and recognizing faces. However, an application built with machine learning does not carry over to tasks other than the one it was built for.

Specific usage of data

Applications that use machine learning do only one specific thing. This is in line with what we have seen in the history of automation so far. Washing machines do not wash dishes or cook; they only wash clothes. Machines that can play chess cannot do your taxes. A machine learning system that can translate cannot recognize cats.

In other words, AI needs specific data. The data you need for the application you create is specific to the task you are trying to solve. (Of course, research on transfer learning, which reuses part of what a model has learned from different data, is advancing every day.)

This means that the implementation of machine learning is spread across many companies rather than concentrated in a few. Google does not really have all the data. Google only has Google-specific data, which allows Google to produce higher quality search results.

GE will collect even more data on engines. Vodafone will collect data on phone patterns and network planning.

Google is getting better and better at being a search engine. But that doesn’t mean Google is getting better at other things.

So, does that mean that the already large companies in their industry are getting bigger and bigger? Vodafone, GE, Amex already have a large amount of data on their respective areas. Does this imply a competitive advantage?

The answer is actually much more complicated. Consider the answers to some other questions…

Who owns the data in the first place?
How unique is the data?
If it is unique, at what level is it unique?
Where is the data aggregated and analyzed further?

These answers are different for each case.

If the data network effect works…

Some data is unique to the business or product and gives a great competitive advantage. Data about GE engines will not help anybody to analyze Rolls-Royce engines. If it does help, they will not share it in the first place. If you can see an opportunity here, then this might even be an opportunity to create a new company.

Some data might be useful for certain use cases. It can be used across many companies and industries.

Card companies can use AI to detect suspicious phone usage. Companies with call centers can use AI that can guess if a customer is angry.

Many solutions arise to solve problems across industries. Certainly, data network effects work here.

If the data network effect stops halfway

Even in such a situation, after collecting a certain amount of data, more data may not be useful. This is the law of diminishing returns. Vendors may not need to keep collecting more data about each and every customer, because the AI used in the product is already functioning well enough.
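To make the diminishing returns concrete, here is a minimal Python sketch. It does not model any real product. As a toy stand-in for model accuracy, it uses the standard error of a sample mean, which shrinks in proportion to 1/sqrt(n). The data sizes and the function names `standard_error` and `marginal_gains` are illustrative assumptions.

```python
import math


def standard_error(n_samples: int, sigma: float = 1.0) -> float:
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n_samples)


def marginal_gains(sizes):
    """Absolute error reduction gained at each step up in data size."""
    errors = [standard_error(n) for n in sizes]
    return [before - after for before, after in zip(errors, errors[1:])]


# Hypothetical data sizes, each ten times larger than the last.
sizes = [100, 1_000, 10_000, 100_000, 1_000_000]
gains = marginal_gains(sizes)

for (n_from, n_to), gain in zip(zip(sizes, sizes[1:]), gains):
    print(f"{n_from:>9,} -> {n_to:>9,} samples: error drops by {gain:.4f}")
```

Each tenfold increase in data costs far more to collect than the previous one, yet it removes less error. Real learning curves differ from this toy formula, but many have the same flattening shape.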

Data that can be collected easily does not give much competitive advantage

For example, suppose a large car maker uses machine learning to accurately predict tire punctures. The model here is based on a lot of data about punctured tires and not about the tires themselves. Collecting such data is not particularly difficult. Anyone can build this feature, so it does not give any competitive advantage.

The value offered by AI and machine learning

SQL is common for database management. Of course, you do not gain any competitive edge by using it when everybody is using it already. However, by not using it, you are losing any advantage you might have had. After all, your competitors have already adopted the technology.

The utilization of data is an important building block for improving the processes of a business. Companies and startups that can use data efficiently perform better. In fact, the effective use of data might even lead to the creation of a different type of company. Wal-Mart owes its success to efficiently managing its inventory and distribution functions using a database.

The question is, would the same thing happen with machine learning?

Certainly, there are fields where data network effects work. At the same time, it is not enough to collect just any data. Nor does a large amount of data by itself give you a competitive advantage.

However, we often see companies that are strong with data working hard to collect even more. When collecting data becomes a priority, the preferred source is easy-to-collect data. In the end, such data is not particularly useful for your business. Nor does it give the business any competitive advantage. This leads to wasted effort and money.

Conclusion

Before you collect data, you should consider what specific thing you can do with it. Thinking about it only after collecting the data accomplishes little.

First, clarify the purpose of the business and what kind of problem you want to solve. Then consider machine learning and how the data will be useful. In fact, consider if it will even be useful in the first place!

Also, consider who actually needs that data. Making a strategy and proceeding with it makes sense only when you have considered these points.

Hence, it is better to make a prototype of a product or service and simulate it. You can perform simulations on fictional data. Once you have the results of the simulation, you can decide if you should actually collect the data.
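As one way to picture such a simulation, here is a minimal Python sketch that runs on fictional data. Everything in it is a made-up assumption for illustration: the "sensor reading", the idea that failing units read higher by `signal_strength`, the simple threshold rule, and the function name `simulate_accuracy`. It only shows how you might check, before collecting anything, how strong a signal the data would need to carry for a product to work.

```python
import random


def simulate_accuracy(signal_strength: float, n_units: int = 2000, seed: int = 0) -> float:
    """Accuracy of a simple threshold rule on fictional sensor data.

    Failing units get readings drawn from N(signal_strength, 1) and healthy
    units from N(0, 1). Both the data and the rule are invented.
    """
    rng = random.Random(seed)
    threshold = signal_strength / 2  # midpoint between the two class means
    correct = 0
    for _ in range(n_units):
        fails = rng.random() < 0.5  # half of the fictional units fail
        reading = rng.gauss(signal_strength if fails else 0.0, 1.0)
        predicted_fail = reading > threshold
        correct += predicted_fail == fails
    return correct / n_units


# Try several hypothetical signal strengths before committing to collection.
for strength in (0.0, 0.5, 2.0):
    print(f"signal strength {strength}: accuracy {simulate_accuracy(strength):.2f}")
```

If the prototype only becomes useful at a signal strength that the real data is unlikely to have, that is a reason to rethink the plan before spending money on collection.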

Data stored in the sweet hope that it will be useful someday will usually never see the light of day. It’s like buying a lottery ticket every year and hoping that you will win someday.

Data might be a valuable resource, but it is not necessarily the “gold” or “oil” that some people say it is!
