Background

This article is part of a larger independent study (see below) on how product managers integrate machine learning into their products. It was conducted by Ryan Dingler and myself, MBA students at the University of California, Berkeley, with the help of Vince Law, our instructor.

The study aims to understand how product managers design, plan, and build products that support machine learning. To reach this understanding, we interviewed 15 product development experts from various technology companies. Of the 15 companies represented, 14 have a market value above 10 billion, 11 are publicly listed, 6 are B2C, and 9 are B2B.

The Product Manager's Guide to ML series:

Identify opportunities

If you have worked on a product team in the past few years, you may have heard someone (probably a PM) ask, "Can't we just solve this problem with machine learning (ML)?"

There is a common notion that ML can address many of the challenges that product managers face. With virtual assistants on our phones and personalized recommendations after each purchase, it's hard to deny that ML is changing the way products are built and consumed. However, it is often difficult to know where to use ML in a product.

Why is this important?

When it comes to machine learning, finding the right problem to solve is critical. Data scientists and ML engineers are limited resources. Choosing the wrong project for your team is not only costly; it can also undermine morale and customer trust and lead to product failure.

Proper problem identification can help keep your work out of the graveyard of ML models and products.

What problems can be solved by machine learning?

In our research, we encountered a variety of ways companies use ML in their products. However, we noticed some common trends across these use cases and broke them down into four (sometimes overlapping) problem areas: detecting anomalies, filtering information, moderating content, and automating repetitive tasks.

These areas are intended to provide examples of what types of business problems may be good candidates for ML.

Note: We have provided some examples that are similar to what we encountered in our research. The areas identified are the ones we observed directly in our interviews, not a complete list.

Detect anomalies

ML is great for detecting patterns in data. This advantage can be leveraged to help users find data points that don't match typical patterns more easily. In our research, we found that companies use supervised learning, unsupervised learning, and even a combination of both for anomaly detection. The method of choice depends on the use case.

Supervised: Companies regularly audit their accounting books, spot-checking hundreds of millions of entries for fraud or errors. Because of the technical domain knowledge the task requires, these audits are usually very manual. Years of manual audit results provide excellent labeled data for a supervised model. These models produce outputs that are easy to interpret. That said, the model may miss new types of fraud or errors that do not resemble the training set.

Unsupervised: In digital advertising, if left unchecked, publishers have a strong incentive to click on the ads on their own sites, and companies have an incentive to click on competitors' ads. Simple heuristics can prevent some fraudulent behavior, but unsupervised models can find new patterns in the data, which makes them excellent at detecting different types of fraud. Unsupervised models can identify bad actors based on data such as user IP address, transactions, and time. However, these models often produce predictions that are opaque and difficult to interpret.

Combined: Combining supervised and unsupervised learning is often the most effective method of anomaly detection. Suppose an unsupervised model finds fifty users it believes to be fraudulent. A supervised model can then be applied to provide more detail about why these users may be fraudulent (e.g., the same IP address, similar timestamps, etc.). Other methods, such as semi-supervised learning, can also improve performance.
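
To make the unsupervised case concrete, here is a minimal sketch that flags IP addresses with unusual click counts without any labeled examples. It uses a robust z-score (distance from the median, scaled by the median absolute deviation) as a deliberately simple stand-in for an unsupervised model. The data, the function names, and the 3.5 threshold are illustrative assumptions, not something taken from the companies we interviewed.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the data, so nothing stands out.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(clicks_by_ip, threshold=3.5):
    """Return the IPs whose click count is far from the typical count."""
    ips = list(clicks_by_ip)
    scores = robust_z_scores(clicks_by_ip[ip] for ip in ips)
    return {ip: score for ip, score in zip(ips, scores) if abs(score) > threshold}


# Hypothetical ad-click counts per IP address for one day.
clicks_by_ip = {
    "10.0.0.1": 12,
    "10.0.0.2": 9,
    "10.0.0.3": 11,
    "10.0.0.4": 10,
    "10.0.0.5": 480,
    "10.0.0.6": 13,
}
suspicious = flag_anomalies(clicks_by_ip)
print(sorted(suspicious))
```

No labels were needed to single out the one unusual address, but the score alone does not say why it is suspicious. That gap is the interpretability problem described above, and it is where a supervised model or a human reviewer comes in.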

Filter information

Users are often overwhelmed by products that present too much information. There are two basic ways to address this problem with ML: search and recommendations.

Search is when a user tries to "pull" information. Sometimes users need to find information or objects but don't know what to look for or where to find them. A simple search algorithm can use text matching and recently viewed items to find objects, but ML can do more. An ML model can consider hundreds or thousands of features when ranking search results, in a way that rule-based search cannot.


Search is at the core of the Dropbox experience. When users search for "machine learning" in their organization's Dropbox, a set of documents is returned and then ranked. The ranking is based on the query text ("machine learning"), but it also uses a relevance score. This score takes into account the searcher, the people they interact with, and the files they have recently opened (freshness). Such models can be trained on readily accessible data such as past user searches and click results.
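
As a rough illustration of how such a relevance score might combine signals, the sketch below ranks a few documents using text match, freshness, and whether a collaborator shared the file. It is not Dropbox's actual model. The `Doc` fields, the feature definitions, and the hand-set weights are all hypothetical; in a real system the weights would be learned from past searches and clicks, and the feature count would be far larger.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    days_since_opened: int
    shared_by_collaborator: bool


def text_match(query, title):
    """Fraction of query words that appear in the title."""
    query_words = set(query.lower().split())
    title_words = set(title.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & title_words) / len(query_words)


def relevance(query, doc, weights=(0.6, 0.3, 0.1)):
    """Weighted sum of text match, freshness, and a social signal."""
    w_text, w_fresh, w_social = weights  # hypothetical, hand-set weights
    freshness = 1.0 / (1.0 + doc.days_since_opened)
    social = 1.0 if doc.shared_by_collaborator else 0.0
    return w_text * text_match(query, doc.title) + w_fresh * freshness + w_social * social


docs = [
    Doc("Quarterly budget", days_since_opened=1, shared_by_collaborator=True),
    Doc("Machine learning reading list", days_since_opened=30, shared_by_collaborator=False),
    Doc("Machine learning roadmap", days_since_opened=0, shared_by_collaborator=True),
]
ranked = sorted(docs, key=lambda d: relevance("machine learning", d), reverse=True)
for doc in ranked:
    print(doc.title)
```

Both "machine learning" documents match the query text equally well, so the freshness and collaborator signals decide which one ranks first.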

As access to large-scale computing increases, image, video, and audio search become possible. Even without manual tags on Facebook or YouTube videos, an ML model can extract audio and use image recognition to index a video for search. Similarly, Squarespace uses visual search to help its users find sites with similar homepage images.

Recommendation systems

If search is "pull", recommendation is "push". As with search, recommendation models help users navigate information overload, but they push personalized information to the user. The most common applications of recommendation ML are social media news feeds and Amazon's "customers who bought this item also bought." However, other products are also beginning to push personalized recommendations to users.


When a user opens Instagram, Reddit, or LinkedIn, an ML model automatically provides a personalized, seamless experience filled with updates from the people or topics they are interested in. Even ads can be embedded and personalized so that they become part of the recommendation experience.

The first thing you see when opening the Nordstrom shopping app is "Products for you." It uses ML to help customers discover sales and new products they might like based on their shopping history. Nordstrom can also help you find products that you haven't previously viewed or purchased but that go with other products you have bought. Many retailers now use ML to power recommendations in their online presence.
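
One simple way to produce "also bought" style suggestions is to count how often items appear together in past orders. The sketch below does exactly that. It is a toy illustration of the idea, not how Amazon or Nordstrom build their recommendations, and the baskets and function names are made up for the example.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_bought(item, counts, k=2):
    """Return up to k items most often purchased together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(k)]


# Hypothetical purchase history: one list per order.
baskets = [
    ["boots", "socks", "scarf"],
    ["boots", "socks"],
    ["boots", "umbrella"],
    ["scarf", "gloves"],
]
counts = co_purchase_counts(baskets)
print(also_bought("boots", counts, k=1))
```

A production recommender would also personalize by user, handle new items with no history, and weigh recency, but the co-occurrence signal is a useful baseline to compare against.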

Moderate content

More and more companies rely on user-generated content in their products, so moderating that content is becoming increasingly important. Photos, text, audio, video, and even live streams need to be reviewed for compliance with the rules set by each platform. It is impossible for a company to have human moderators enforce these rules on all of its content.

Users upload 500 hours of video to YouTube every minute. Reviewing all of it would require more than 100,000 reviewers working 40 hours per week. Instead, companies such as YouTube rely on users and ML to flag content for moderator review. In this use case, ML should mirror what human reviewers do, because the data is human-labeled and the success metrics are based on manual review. This reliance on human judgment continues to make ML moderation a fundamental challenge that needs to be addressed.
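
The reviewer estimate can be checked with simple arithmetic. The sketch below assumes that every uploaded minute is watched exactly once, at normal speed, by a single reviewer. That assumption is ours and is only meant to show the order of magnitude.

```python
HOURS_UPLOADED_PER_MINUTE = 500   # upload rate quoted above
MINUTES_PER_WEEK = 60 * 24 * 7    # 10,080 minutes in a week
REVIEWER_HOURS_PER_WEEK = 40      # one full-time reviewer

# Total hours of new video arriving each week.
hours_uploaded_per_week = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_WEEK

# Reviewers needed if each hour of video takes one hour to review.
reviewers_needed = hours_uploaded_per_week / REVIEWER_HOURS_PER_WEEK

print(f"{hours_uploaded_per_week:,} hours of video per week")
print(f"{reviewers_needed:,.0f} full-time reviewers")
```

Under these assumptions the result is about 126,000 full-time reviewers, which is consistent with the "more than 100,000" figure.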

Reddit, for example, does very little moderation on its platform. It optimizes for freedom of speech, so the platform mainly moderates illegal content. Professional networks like LinkedIn, on the other hand, consider any inappropriate content a huge problem. Platforms like LinkedIn can moderate content more heavily, and ML models can help keep the platform clean without making teams or individuals uncomfortable.
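
One way to picture how the same model can serve platforms with different tolerances is to keep the model's output fixed and vary the thresholds that decide what happens next. The sketch below is purely illustrative: the score, the policy names, and the threshold values are hypothetical and do not describe any real platform's settings.

```python
def route(score, remove_at, review_at):
    """Decide what to do with content given a model score in [0, 1].

    `score` is assumed to be the model's estimated probability that the
    content violates the platform's rules.
    """
    if score >= remove_at:
        return "remove"
    if score >= review_at:
        return "human_review"
    return "publish"


# Hypothetical policies: a strict professional network sends borderline
# content to moderators early, while a permissive forum only acts on
# content the model is nearly certain about.
POLICIES = {
    "strict": {"remove_at": 0.80, "review_at": 0.30},
    "permissive": {"remove_at": 0.98, "review_at": 0.90},
}

borderline_score = 0.5
for name, policy in POLICIES.items():
    print(name, route(borderline_score, **policy))
```

With these made-up numbers, the same borderline post goes to a human moderator under the strict policy and is published under the permissive one.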

Automate repetitive tasks

The last problem area we saw in many ML applications is automating repetitive tasks. These tasks include predicting the quality of leads, entering and sorting receipt data, or sending marketing emails. This area is usually a good place to start with ML because labeled data is abundant and the time savings are immediate.


Entering receipts into an expense report is a very repetitive task that can be automated with optical character recognition and ML. The ML model takes a receipt image and automatically fills in the fields of the user's expense report. Old manual reports provide the information needed to train this model, which makes the data easy to obtain (although SaaS vendors often have to ask for access to customer data).

Even complex tasks such as email writing can be partially automated. Gmail's Smart Compose predicts the next words in a sentence using the sequence of words the user has already typed.

The model also uses the email's subject line and any previous emails in the thread as input. This approach takes the repetitive, manual parts of email writing and automates them with ML.
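
The idea of predicting the next word from the words already typed can be sketched with a simple bigram counter. This is a toy stand-in, not Gmail's model: it looks only at the single previous word, ignores the subject line and thread, and is trained on a handful of made-up sentences.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count which word follows each word in the training sentences."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model


def suggest_next(model, text):
    """Suggest the word most often seen after the last word typed."""
    words = text.lower().split()
    if not words or words[-1] not in model:
        return None  # nothing to suggest for unseen words
    return model[words[-1]].most_common(1)[0][0]


# Hypothetical past emails used as training data.
past_emails = [
    "thanks for the update",
    "thanks for the quick reply",
    "thanks for your help",
    "see you at the meeting",
]
model = train_bigrams(past_emails)
print(suggest_next(model, "Thanks"))
print(suggest_next(model, "Thanks for"))
```

Even this tiny example shows why the task suits ML: the training data is simply past emails, and every accepted suggestion saves the user a little typing.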

Final thoughts

If the problem you are facing falls into one of these four areas (or another area not covered here), consider starting a machine learning project, and read the Product Manager to-dos when you do.

ML technology and capabilities are always changing. The problem areas presented in this article are not comprehensive, nor do they imply that all problems can or should be solved with ML.

This article was reposted from Medium.