Zhihu Blue Ocean: A Gold-Mining Guide to a 2000W (20 Million) Traffic Opportunity
Brother Shi love 2020-11-25 16:15:49
Hello friends. I'm the author, an individual webmaster of 5 years who is now building a business. In the next 5,000 words I'll show you the right way to get traffic from Zhihu: ideas plus practice, knowledge united with action, all substance and full of details. Please enjoy

According to incomplete statistics from my local data (keywords for which Zhihu ranks in Baidu's top 3):

Total search volume of these keywords on Baidu PC: 127.43 million

Traffic actually obtainable: 127.43 million x 0.15 (average click-through rate) = 19.11 million

That is the PC side alone. A traffic opportunity of more than 2000W (20 million; W stands for ten thousand) is right in front of us

Every way of making money online rests on one premise: you have to get traffic first. And today traffic is more precious than gold

In fact, 5+ friends of mine have used this opportunity over the last six months to earn anywhere from 6W to 25W

All we need is a pair of hard-working hands and a clear mind


Why is there such a traffic opportunity?

What exactly does the traffic opportunity refer to?

How do we get traffic from it?

Below, let me open the door to this traffic for you

Reading guide: unlike the feel-good pieces ("Shuang Wen") all over the market, this article follows my actual working process and shows, step by step, how to go "from 0 to 1". You will need to read and think along, so I suggest setting aside an uninterrupted block of time (10-20 minutes)

1. The game of capital

There is a saying going around "the rivers and lakes" (the industry). The gist is:

Baidu, the "webmaster harvester" that people call the "traffic daddy", invested in Zhihu in August 2019 in a round led by Kuaishou. After that, Baidu boosted Zhihu's weight, and Zhihu's traffic kept rising

When I saw this passage, I put a question mark on the information. Why?

Friends familiar with communication studies will know a basic principle:

For anything, try to focus on judgments of fact rather than judgments of value

A judgment of fact can be settled and agreed on; a judgment of value depends on perspective and position and admits many interpretations

Here, the investment is the fact; the later impact is the value description

Yet even this simple fact circulates in N versions, some with the wrong date and some with the wrong investors

After verifying, you will also find that Baidu invested in Kuaishou not long afterwards. Could that be another opportunity?

Sometimes ideas come from the facts

As for the value judgment: has traffic really grown? Was the weight really boosted?

Validate directly with data (here I take about half a year of Aizhan data starting from the investment date, 2019.8; a small error doesn't matter much):

Keyword count data

From the keyword count data we can observe 2 things:

From mid-November 2019, traffic grew by leaps and bounds: the number of ranking keywords went from 30W to 270W, nearly 10 times!

From July 2020, growth slowed, but it is still growing

So how did this traffic grow?

Indexing data

From the indexing data we can observe the following:

Although the data sources measure differently, during the period of leaping traffic growth the number of indexed pages shows no growth trend. In other words, the pages that were already indexed moved up in the rankings for their search terms. The weight boost is confirmed

Once the indexed pages cannot cover any more search terms, Baidu's targeted traffic to Zhihu will hit its ceiling and stall

This analysis can easily feel like "stating the obvious", because the result is basically the same as the information we first received, and our brains do not register the same information twice

This is exactly the difference between two ways of thinking, "induction" and "deduction"

Without validation, inductive thinking silently assumes that Baidu's weight boost is real, so every later action rests on a hypothesis

Every step of deductive thinking, by contrast, rests on premises that have been verified as true. Think about it: what would have followed if the analysis had shown the opposite?

In this age of information explosion we really need the ability to filter information, and independent thinking is especially important. But independent thinking does not mean we must disagree with everything

Effective thinking must be built on enough accumulated knowledge; otherwise it is blind thinking

If you are in unfamiliar territory, learning from peers is still a good choice

So even though traffic growth is slowing, Zhihu cannot "swallow" such a huge flow in full. There are, and will continue to be, opportunities to use this dividend period to capture traffic and make money

Let's keep going!

2. SEO?

Indexing? Ranking? Weight boost? If these raise questions, you may not know much about SEO, so here is a brief description

SEO means learning the rules of search engines (SE below) and adjusting a site accordingly, so that it ranks higher on the target search engine and obtains traffic

Indexing: after crawling a web page, the SE caches it on its servers

Weight: the SE's overall score for a site, the main basis of ranking

Ranking: the position of a cached page in the search results

All 3 of these are dynamic

So how is one search visit generated?

The user first enters a search term (query) and sends a search request to the SE. The SE ranks its cached pages by algorithm and returns them to the front end (browser). The user looks over the results and clicks a page according to preference

For a page to get traffic, it first has to be indexed (cached by the SE), then rank near the top (top 10), then someone has to search for it (search volume), and finally it has to look worth clicking (title + description)

At the click step Zhihu has an important built-in advantage. After years of positioning and growth as a "knowledge" platform, users naturally trust the Zhihu brand, so even when it is not in the top 3 it may still get a click-through rate above what its position would normally earn

This time the two swords combine perfectly: Baidu sends Zhihu targeted traffic, and Zhihu raises the efficiency of that traffic. Beautiful

3. Blue ocean questions + blue ocean traffic

So where are our opportunities?

Hong Hong has been short of money lately, so he searched Baidu for "How can I get the money" (real data, just an example) and found a certain Zhihu page ranked No. 1

With a shaky little hand he clicked through, looked at the empty page, and his expression changed subtly

What a letdown!

5 years of making money online have given me a keen sense of smell: this is the opportunity

So I pulled a million keywords plus Zhihu data. Screening showed that quite a few question pages have search traffic but suffer from one of the following:

The answers do not address the search intent

The answers are low quality

There are few answers

The top N answers have few upvotes

So can we find such questions, write our own answers, push them to the top, and divert traffic to our own channels (WeChat, official accounts, etc.)?

The answer is yes!

To sum up: questions that have search traffic and low competition we call "blue ocean questions", and the combined traffic of these questions we call "blue ocean traffic"

Here is a small bombshell for you to experience first (SE rankings are dynamic, so an actual search may differ slightly; also, since this article is public, I have picked a fairly ordinary example)

BOOM! That's the one. The same question ranks 2nd on both PC and mobile. Average monthly search volume is 44.7W on mobile and 9.5W on PC, about 50W combined. The click-through rate at rank 2 is about 20%, so this question gets about 10W of SEO traffic a month. And what about the answers inside?

The No. 1 answer has only 58 upvotes. Is there a chance to overtake it? Yes! Is there a way to cash in?

4. Break through cognitive limitations

Some friends may already be itching to start, thinking about how to apply this in their own industry

But what if your industry has no blue ocean traffic? Why must you work in a field you already know?

A traffic master always thinks about the big picture, that is, from a global perspective

This time we want to analyze how Zhihu's overall search traffic is distributed and go wherever the blue ocean traffic is, not confine ourselves to one question or one industry

Even Zhihu "Good Things" product recommendations can be chosen with blue ocean traffic thinking

Always remember that we have only one purpose: making money

This is also the main idea behind my official account 【TACE】 (Traffic ACE, traffic master), though later I went off to work on projects and rarely post. Cough...

I have said a lot because I want to make the level of the "Way" clear, which is why you do it. The "Method" is dead: once the rules change, the method fails immediately

For example: when Tesla started out, its battery cost was about 10 times lower than the market price at the time. How did CEO Musk do it?

Because his way is "physics thinking": break things down into their smallest units and find solutions from there (there is a TED talk on this)

Yet 80% of people prefer to be handed the method directly. Why?

Dad said he heard from Grandpa that hundreds of thousands of years ago, when humans were still hunting, the brain developed in order to survive

But the brain takes millions of years to evolve, while human history is only about 200,000 years, which means we are still using an "old brain"

One notable feature of the "old brain" is the principle of least effort: humans default to low-brainpower behavior and avoid thinking whenever they can, even though learning with the brain engaged gets you closer to the truth

Me included. Whenever I'm too lazy to think, I mock myself as a primitive man. Cough...

Now let's step onto the "battlefield"

5. Build a million-keyword thesaurus

The thesaurus is a collection of search terms and their attributes

We collect keywords from as many channels as we can, because every channel or third-party platform has its limits

In the eyes of traffic experts, what lies in the thesaurus is not keywords but RMB notes, one after another

From the search-traffic perspective, in most cases adding keywords equals adding traffic

If you can find words others can't, you can get traffic others can't and make money others can't

For the storage format I suggest plain CSV: local files with a comma as the separator. Compared with a MySQL-type database, querying and analyzing them with Bash shell could not be more convenient

Keyword sources:

5118, Aizhan, Chinaz (Webmaster's Home).

Below I use 5118 as the example

5.1 Get the parent word


1) Download

Download the Baidu PC keywords and mobile keywords separately and keep them in separate files

Friends without a membership can find one on Taobao; friends with the enterprise edition should do a full export

The following steps involve some programming:

Bash shell (Linux) + Python

Conventional tools can no longer meet the data-processing needs, so we will use the "mysterious" power of programming

I developed all of this myself; the parts that are simple Bash shell command lines are given directly in the article

I expect this will make 80% of readers back off. But who, me included, did not start as a beginner and move up step by step?

Programming really isn't that hard, trust me! If you can, tell yourself to be in that 20%

Also remember that we are not trying to become professional programmers; enough programming to meet our current needs will do

2) Initial processing

Transcode (GBK > UTF-8), because 5118 exports its data in GBK while Linux needs UTF-8

Output only the keywords and no other columns, because the accuracy of third-party data really is unsatisfactory. A platform like 5118 would need to refresh at least 100 million records a day, and the cost is what it is.

Keep only the top 100 rankings. First, because the data accuracy is low, we will verify it ourselves later. Second, as mentioned, rankings are dynamic and Baidu is boosting Zhihu, so there is a time lag between getting the data and verifying it, and the ranking may have changed by then.

bash shell:

# The quoted patterns stand for the export's title row, header row, and rows ranked outside the top 100; use the exact text that appears in your file
iconv -c -f GB18030 -t UTF-8 input_file | grep -Ev "Baidu PC keyword ranking list|Baidu index|outside 100" | awk -F, '{print $1}' > output_file
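For readers who prefer Python to a shell pipeline, the same initial processing can be sketched roughly as follows. This is only a sketch under assumptions: the marker strings in `DROP_MARKERS` are hypothetical English stand-ins for whatever title and header text the real export contains, and the function name `extract_keywords` is my own.

```python
import csv

# Hypothetical markers for rows to drop (title row, header row, rows ranked
# outside the top 100). Replace them with the text found in the real export.
DROP_MARKERS = ("keyword ranking list", "Baidu index", "outside 100")

def extract_keywords(raw_bytes, drop_markers=DROP_MARKERS):
    """Decode a GBK-family export and return the first column of kept rows."""
    # errors="ignore" drops undecodable bytes, similar to `iconv -c`
    text = raw_bytes.decode("gb18030", errors="ignore")
    keywords = []
    for line in text.splitlines():
        if not line.strip() or any(marker in line for marker in drop_markers):
            continue
        row = next(csv.reader([line]))
        if row and row[0].strip():
            keywords.append(row[0].strip())
    return keywords

# Tiny made-up sample, encoded the way the export would be.
sample = (
    "PC keyword ranking list\n"
    "keyword,Baidu index,rank\n"
    "知乎,5000,3\n"
    "流量,100,outside 100\n"
).encode("gb18030")
print(extract_keywords(sample))
```

Using the csv module instead of a plain split means a keyword that itself contains a quoted comma is still read as one field.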

3) Keyword cleaning

Special symbols

This is a very easy step to overlook. Many people naturally trust the keyword data produced by the various channels (Baidu included), yet the search volumes of "traffic master," (with a trailing comma) and "traffic master" are a world apart

Year replacement, for example replace 2010 with 2020

Chinese length >= 2 characters (optional)
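The three cleaning rules above could be sketched in Python like this. It is a minimal sketch, not my production code: treating everything except Chinese characters, ASCII letters, and digits as a "special symbol" is an assumption, as are the function name `clean_keyword` and its parameters.

```python
import re

# Assumption: anything that is not a CJK ideograph, an ASCII letter, or a
# digit counts as a "special symbol" and is stripped.
_SPECIAL = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9]+")
# A standalone four-digit year starting with 19 or 20.
_YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")
_CJK = re.compile(r"[\u4e00-\u9fff]")

def clean_keyword(keyword, target_year="2020", min_chinese=2):
    """Return the cleaned keyword, or None if it should be discarded."""
    cleaned = _SPECIAL.sub("", keyword)
    cleaned = _YEAR.sub(target_year, cleaned)
    if len(_CJK.findall(cleaned)) < min_chinese:
        return None  # fewer Chinese characters than the optional minimum
    return cleaned

print(clean_keyword("流量高手，"))
print(clean_keyword("2010年 高考"))
print(clean_keyword("a"))
```

Set `min_chinese=0` to switch the optional length rule off.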

4) Remove sensitive words

Illegal words, you know the kind. Here we use the DFA algorithm, which processes a keyword in under 0.1s on average
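A DFA-style filter is usually built as a character trie that is walked from every starting position in the text. The sketch below shows the idea in Python; the class name `DFAFilter` and the sample word list are illustrative assumptions, not the code I actually run.

```python
class DFAFilter:
    """Sensitive-word matcher built as a character trie (a simple DFA)."""

    _END = object()  # sentinel key marking the end of a complete word

    def __init__(self, words):
        self.root = {}
        for word in words:
            if not word:
                continue  # ignore empty entries in the word list
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self._END] = True

    def contains_sensitive(self, text):
        """Return True if any listed word occurs anywhere in text."""
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.get(ch)
                if node is None:
                    break  # no listed word continues with this character
                if self._END in node:
                    return True
        return False

# Placeholder word list for illustration only.
word_filter = DFAFilter(["bad", "worse"])
keywords = ["a bad word", "ba d", "clean keyword"]
print([kw for kw in keywords if not word_filter.contains_sensitive(kw)])
```

Lookup cost depends on the keyword length and the longest listed word rather than on the size of the word list, which is why this approach stays fast with large lists.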

5) Deduplication

Deduplication is a very important step, but it needs a lot of memory: the file you want to deduplicate cannot be larger than the available memory

The current solution is sort + uniq: first split the target file with split, sort the pieces one by one with sort, then merge and deduplicate with sort + uniq

This does not significantly reduce memory use, but it does improve computing efficiency

bash shell, short version:

sort input_file | uniq > output_file

bash shell, big-data version:

#!/bin/bash
# Command line arguments:
# $1 input file
# $2 output file
basepath=$(cd "$(dirname "$0")"; pwd)
echo "$(date) [wordsUniq.sh DEBUG INFO] Start file segmentation ..."
split -l300000 "$1" "${basepath}/words_split/split_"  # split the file
echo "$(date) [wordsUniq.sh DEBUG INFO] Start single sort ..."
for f in "${basepath}"/words_split/split_*
do
    sort "$f" > "$f.sort"  # sort each piece
done
echo "$(date) [wordsUniq.sh DEBUG INFO] Start merging and deduplication ..."
sort -sm "${basepath}"/words_split/*.sort | uniq > "$2"
echo "$(date) [wordsUniq.sh DEBUG INFO] Delete cached data ..."
rm "${basepath}"/words_split/*

Usage:

Save it as filename.sh, create a words_split folder in the same directory as the script, then run the command below. The input and output files can include a path

sh filename.sh input_file output_file
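The split, sort, and merge idea also carries over to Python. The sketch below is an assumption-laden equivalent, not a port of my script: it sorts fixed-size chunks into temporary files and merges them with `heapq.merge`, so only one chunk plus one line per chunk file is held in memory at a time. The function name `dedupe_large_file` and the default chunk size (borrowed from the script's 300000 lines) are illustrative.

```python
import heapq
import itertools
import os
import tempfile

def dedupe_large_file(src, dst, chunk_lines=300_000):
    """Write the sorted, de-duplicated lines of src to dst via chunk files."""
    with tempfile.TemporaryDirectory() as tmp:
        chunk_paths = []
        with open(src, encoding="utf-8") as fin:
            while True:
                chunk = [ln.rstrip("\n") for ln in itertools.islice(fin, chunk_lines)]
                if not chunk:
                    break
                path = os.path.join(tmp, f"chunk_{len(chunk_paths)}.txt")
                with open(path, "w", encoding="utf-8") as fout:
                    # sort and de-duplicate inside the chunk
                    fout.writelines(ln + "\n" for ln in sorted(set(chunk)))
                chunk_paths.append(path)
        files = [open(p, encoding="utf-8") for p in chunk_paths]
        try:
            with open(dst, "w", encoding="utf-8") as fout:
                previous = None
                # merge the sorted chunks; equal lines arrive next to each other
                for line in heapq.merge(*files):
                    if line != previous:
                        fout.write(line)
                        previous = line
        finally:
            for f in files:
                f.close()

# Small demonstration with a tiny chunk size so several chunks are created.
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, "in.txt")
demo_dst = os.path.join(demo_dir, "out.txt")
with open(demo_src, "w", encoding="utf-8") as f:
    f.write("b\na\nb\nc\na\n")
dedupe_large_file(demo_src, demo_dst, chunk_lines=2)
with open(demo_dst, encoding="utf-8") as f:
    print(f.read().splitlines())
```

Note that the output order follows Python's string ordering, which may differ from the locale-dependent order of the shell's sort.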

OK, processing is done. We now have two very "clean" sets of parent keywords: the Baidu PC keywords and the mobile keywords

5.2 Keyword expansion

Keyword expansion extends the parent keywords we obtained, because one page may rank for many related keywords

We can assume that the keywords obtained from third-party platforms are only the ones those platforms managed to find, a subset of the keywords the page currently ranks for

We should try our best to find the rest, so that we can estimate the Baidu traffic of a question page accurately

Suppose there are two questions, A and B. In your thesaurus, A hits 50 keywords with total traffic of 1W, while B hits 10 keywords with traffic of 100

You would probably ignore B and deal only with A

But B actually hits 100 keywords with traffic of 10W

Incomplete data creates an information gap, and you directly miss the chance to get this traffic

For example:

After expansion, this page hits 47 keywords with total PC + mobile traffic of 132W. Posting too much promotional material risks a risk-control warning, so only part of the data is shown here

[Image: part of the expanded keyword data]

Well, are you starting to feel the charm of data? Cheer up, let's keep going!

Since we only work on Baidu traffic, we only use Baidu for expansion

1) Scraping related searches + drop-down suggestions

Many people only know to scrape these two channels without understanding their nature:

Related searches

Related search is horizontal expansion. Most results are related extensions across keyword topics, so the topic can drift badly. To preserve relevance, scrape only one round

Drop-down suggestions

The drop-down box is vertical expansion. Most suggestions add affixes to the end of the keyword

The point of clarifying the nature of the channels is that keywords, as text data, have these two directions of expansion and only these two. Every other expansion method is a combination or variation of them

Because different devices may produce different data, we take the PC and mobile parent keywords and expand each on its own side

That is, PC parent keywords are expanded with PC related searches + PC drop-down suggestions, and mobile parent keywords with mobile related searches + mobile drop-down suggestions

2) Baidu Promotion backend

The path is: Register / Log in > Search Promotion > Promotion Management > Keyword Planner > Keywords

Registration is free. You can also use the Aiqi SEM tool, the Douniu SEO tool, and so on

3) Keyword processing

First merge the keywords from each channel, keeping PC and mobile separate

bash shell:

cat file1.txt file2.txt > all.txt

Then repeat the keyword cleaning and deduplication steps from 【5.1 Get the parent word】

5.3 Get keyword traffic

This again uses the Keyword Planner in the Baidu Promotion backend, but this time its "Traffic query" function

This is Baidu's official traffic data. The figure used to be daily search volume and is now monthly search volume, but that doesn't matter

Some friends may ask: why not get the rankings and filter the data first, to reduce the data volume in the next step?

Because the Keyword Planner can look up 1,000 keywords at a time! 10W keywords need only 100 queries!

Testing also shows that a cookie obtained once can be used across days and keeps the login valid for 10+ hours (promise me, please be gentle with it)
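To make the batching concrete, here is a minimal Python sketch that cuts a keyword list into groups of 1,000. The batch size comes from the limit mentioned above; the function name `batches` is just an illustrative choice, and the actual request to the Keyword Planner is deliberately left out.

```python
def batches(items, size=1000):
    """Yield consecutive slices of items, each holding at most size entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Stand-in list for the 10W keywords mentioned above.
keywords = [f"keyword_{i}" for i in range(100_000)]
print(sum(1 for _ in batches(keywords)))  # number of queries needed
```

Pausing between batches is one simple way to stay "gentle" with the backend.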

1) Traffic data acquisition

Simulate a login and POST the keyword data

2) Data filtering

On each side, keep only keywords with search volume >= N (choose your own value)

You can filter while fetching the data, or split filtering out into a separate step. I suggest the latter: if the threshold turns out to be unreasonable, there is still room to filter again

bash shell:

awk -F, '$2>=100' file.txt > file_new.txt
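If the threshold needs to live in a script rather than a one-liner, the same filter might look like this in Python. The column layout (keyword in the first column, search volume in the second) follows the shell example; the function name `filter_by_volume` and the sample rows are assumptions for illustration.

```python
import csv
import io

def filter_by_volume(rows, min_volume=100, volume_col=1):
    """Yield CSV rows whose search volume is at least min_volume."""
    for row in rows:
        try:
            volume = float(row[volume_col])
        except (IndexError, ValueError):
            continue  # skip header lines and malformed rows
        if volume >= min_volume:
            yield row

# Made-up rows standing in for file.txt.
sample = io.StringIO("traffic master,1200\nrare phrase,40\nkeyword,volume\n")
kept = list(filter_by_volume(csv.reader(sample)))
print(kept)
```

Keeping the unfiltered file and writing the result to a new one, as the shell version does, leaves room to rerun with a different threshold.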

5.4 Get keyword ranking

Get the ranking data for each side separately. Keep only the keywords for which a URL of the form

https://www.zhihu.com/question/{question ID}

ranks in the top 10, and store the question URL
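Assuming the ranking data has already been collected as (keyword, rank, URL) tuples, the filter described above can be sketched as follows. The tuple layout, the function name `keep_zhihu_top10`, and the sample IDs are all hypothetical; only the URL pattern comes from the article.

```python
import re

# Matches question pages and captures the numeric question ID.
QUESTION_URL = re.compile(r"^https://www\.zhihu\.com/question/(\d+)")

def keep_zhihu_top10(results):
    """Keep top-10 hits on Zhihu question pages, normalised to the question URL."""
    kept = []
    for keyword, rank, url in results:
        match = QUESTION_URL.match(url)
        if match and 1 <= rank <= 10:
            question_url = f"https://www.zhihu.com/question/{match.group(1)}"
            kept.append((keyword, rank, question_url))
    return kept

# Made-up ranking rows for illustration.
results = [
    ("kw1", 2, "https://www.zhihu.com/question/12345/answer/678"),
    ("kw2", 11, "https://www.zhihu.com/question/12345"),
    ("kw3", 1, "https://zhuanlan.zhihu.com/p/1"),
]
print(keep_zhihu_top10(results))
```

Normalising answer links back to the question URL means all keywords that land on the same question can later be summed together.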

5.5 Available traffic

Keyword traffic is not the same as the traffic a question page can actually get

As mentioned earlier, search traffic passes through a click step before it reaches the page, so we need to calculate the available traffic. The formula is:

Available traffic = search volume x click-through rate

The click-through rate is estimated from the ranking, but Baidu seems never to have published click-through data. Cough...

However, we found Google click-through data published by Sistrix on July 14, 2020, which analyzed more than 80 million keywords and billions of search results

It only covers mobile, but that doesn't matter

Original article (English):

After calculating the available traffic of each keyword, our thesaurus is complete. Niceee!
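As a worked example of the formula, the sketch below multiplies search volume by a rank-based click-through rate. The lookup table is an assumption: only the rank-2 value of about 20% appears in this article, and the remaining ranks should be filled in from whichever click-through study you rely on. The function name `available_traffic` is also my own.

```python
def available_traffic(search_volume, rank, ctr_by_rank):
    """Available traffic = search volume x click-through rate at this rank."""
    # Ranks missing from the table contribute no traffic.
    return search_volume * ctr_by_rank.get(rank, 0.0)

# Only the rank-2 figure (about 20%) is taken from the article.
ctr_by_rank = {2: 0.20}

# Mobile + PC monthly searches from the rank-2 example in section 3.
monthly_volume = 447_000 + 95_000
print(available_traffic(monthly_volume, 2, ctr_by_rank))
```

The result is a little above the rounded 10W quoted in section 3, because that figure started from the rounded 50W total.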

6. Data acquisition

The purpose of data acquisition is to let us make a preliminary judgment, from these N dimensions of data, of how hard a question is to win (see 8.1 Data filtering)

Data should be precise rather than plentiful; too much data only interferes with judgment

Question views

Follower count (Zhihu on-site traffic)

Question creation time

Number of answers

Upvotes on the No. 1 answer

Word count of the No. 1 answer

Time of the No. 1 answer

At this point all the basic data we need is ready. You should now have a keyword file carrying Baidu + Zhihu data. Good job!

If you have stuck with it this far, I believe I would love to meet a friend like you ^_^

7. Data analysis

7.1 Keyword grouping

Facing massive, disorderly data, we need to group by keyword, gathering related keywords and their question pages together

1) jieba word segmentation

Use the Python jieba module to cut each keyword into N terms. For example, "traffic master" is segmented into "traffic" + "master". Keywords containing the same term are treated as one group

2) Term deduplication

See the deduplication part of 【5.1 Get the parent word】

3) Term data calculation

Match the keywords against each term and calculate the number of matches (term frequency) and the sum of available traffic

Friends in SEO may find this familiar: the approach resembles a search engine's "inverted index". We effectively index by term and classify the Zhihu URLs under it
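The term index described in this section can be sketched in a few lines of Python. To keep the example runnable without extra packages, the tokenizer is passed in as a function and the demo uses whitespace splitting on made-up English rows; with jieba installed, `jieba.lcut` could be passed instead. The function name `build_term_index` and the row layout are assumptions.

```python
from collections import defaultdict

def build_term_index(rows, tokenize):
    """rows: iterable of (keyword, available_traffic, question_url)."""
    index = defaultdict(lambda: {"frequency": 0, "traffic": 0, "urls": set()})
    for keyword, traffic, url in rows:
        for term in set(tokenize(keyword)):  # count each term once per keyword
            entry = index[term]
            entry["frequency"] += 1
            entry["traffic"] += traffic
            entry["urls"].add(url)
    return dict(index)

# Made-up rows; the question IDs are placeholders.
rows = [
    ("stock trading basics", 1200, "https://www.zhihu.com/question/1"),
    ("stock account opening", 800, "https://www.zhihu.com/question/2"),
    ("car maintenance", 500, "https://www.zhihu.com/question/3"),
]
index = build_term_index(rows, str.split)
top = sorted(index.items(), key=lambda item: item[1]["traffic"], reverse=True)
print(top[0][0], top[0][1]["frequency"], top[0][1]["traffic"])
```

Sorting the index by summed traffic gives a first view of which terms are worth classifying by hand in the next step.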

Here is some demo data:

[Image: demo data]

7.2 Manual classification

Grouping directly by term is simple grouping at the string level: simple and crude, but lacking semantic relations

For example, "stock trading" and "stocks" should both belong to the finance category, but grouping by term puts them in two groups, so in the end we have to go through the groups by hand

After classification, add up the corresponding term frequencies and available traffic to get the totals

Then record the result as a mind map or a table. Here is a mind map example

[Image: mind map example]

But remember: don't group for the sake of grouping. Terms with no obvious relevance should not be forced together, or you are only making trouble for yourself

8. Question screening

8.1 Data filtering

Now pick a term from the category with the most available traffic. In the keyword file produced after steps 【6-7】, use Bash shell or search the keyword column in Excel (CSV) to find the keywords containing this term, then screen them with the indicators. The values below are for reference only

Question views (auxiliary)

Follower count (auxiliary)

Question creation time (auxiliary)

Number of answers <= 50

Upvotes on the No. 1 answer <= 100

Word count of the No. 1 answer <= 800

Time of the No. 1 answer (auxiliary)

Available traffic >= 100
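The hard thresholds above translate directly into a small filter. In this sketch each question is a dict; the field names (`answers`, `top_upvotes`, `top_words`, `available_traffic`) and the sample records are invented for illustration, while the default limits are the reference values listed above.

```python
def is_blue_ocean(q, max_answers=50, max_top_upvotes=100,
                  max_top_words=800, min_traffic=100):
    """Apply the reference thresholds to one question record (a dict)."""
    return (q["answers"] <= max_answers
            and q["top_upvotes"] <= max_top_upvotes
            and q["top_words"] <= max_top_words
            and q["available_traffic"] >= min_traffic)

def screen(questions):
    """Keep passing questions, highest available traffic first."""
    kept = [q for q in questions if is_blue_ocean(q)]
    return sorted(kept, key=lambda q: q["available_traffic"], reverse=True)

# Made-up records for illustration only.
questions = [
    {"url": "q1", "answers": 12, "top_upvotes": 58, "top_words": 300, "available_traffic": 5000},
    {"url": "q2", "answers": 400, "top_upvotes": 9000, "top_words": 2500, "available_traffic": 80000},
    {"url": "q3", "answers": 3, "top_upvotes": 0, "top_words": 120, "available_traffic": 20000},
]
print([q["url"] for q in screen(questions)])
```

The auxiliary indicators are left out on purpose, since they inform a manual judgment rather than a hard cutoff.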

Consider one scenario. After screening by the hard indicators, suppose a question page has far fewer views than its available traffic, few followers, a recent creation time, and a recent No. 1 answer. This kind of question deserves a special mark

But why? Friends may want to think it through for themselves first

All right, I'll tell you. The number of such questions in each category is limited. If the above conditions were reversed, you would most likely already have missed part of the traffic. So we need a sense of getting there first

After filtering, sort by 【Available traffic】 or 【Upvotes on the No. 1 answer】 and so on, and the blue ocean questions are clear at a glance

8.2 Manual screening

Manual work mainly handles what the data cannot judge, namely whether the No. 1 answer fails to meet the need behind the question. We mainly look for the following 2 types:

1) Directly satisfied, but the user's implicit needs are not met, so there is room to expand

Example

Q: "How often should a car be serviced?"

A: "I usually service it once a quarter"

A (new): "Service intervals differ by brand. Below I list all the brands xxx, the service items xxx, oil selection xxx, service pitfalls xxx"

2) Indirectly satisfied

The answer only gets there by a detour. In the example, the answer explains what the Keystroke Wizard tool is, but it does not show how to write the script

I believe that by now you have found N questions in N categories and are about to start analyzing the question > outline > xxxx.....

Stop! Please stop right there. We have one last step

9. Traffic tracking

The last step of the long march. It is very important, very important, very important

Earlier we mentioned 2 points:

In the Keyword Planner of the Baidu Promotion backend, traffic is reported per month, and it is an estimate

SEO page rankings are dynamic

Both can make the results unstable. You painstakingly prepare the data, write the answer, win the ranking, and then it turns out nobody reads it?

So we need to monitor how the page's views grow, to determine whether the page really gets traffic and how much, and only then decide whether to answer the question

The monitoring unit can be days, or every N hours if you want to be careful. How long to monitor is your own call; the longer, the more accurate, of course

For example, suppose a question's available traffic is 15W per month. Then the average daily available traffic is about 5,000, and over 3 days (ignoring holidays) it is 1.5W

Record the page views and compare. As long as the deviation up or down is not large, the question can go on our answer list
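The comparison above can be sketched as a small check. The 30-day month and the 30% tolerance are assumptions chosen for illustration (the article only says the deviation should not be large), and the function names and sample view counts are hypothetical.

```python
def expected_views(monthly_available, days, days_in_month=30):
    """Expected page views over the monitoring window."""
    return monthly_available / days_in_month * days

def passes_tracking(views_start, views_end, monthly_available, days, tolerance=0.3):
    """True if the observed growth in views stays within tolerance of the estimate."""
    observed = views_end - views_start
    expected = expected_views(monthly_available, days)
    return abs(observed - expected) <= tolerance * expected

# The 15W example: 3 days of monitoring should show roughly 1.5W new views.
print(expected_views(150_000, 3))
print(passes_tracking(10_000, 24_000, 150_000, 3))  # 14,000 new views observed
print(passes_tracking(10_000, 12_000, 150_000, 3))  # only 2,000 new views observed
```

A question that gains far more views than estimated also falls outside the tolerance here; whether to treat that as a pass is a judgment call you may want to encode differently.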

10. Finally

Raising our view to the whole marketing process, we find that acquiring blue ocean traffic is only the first step. Other parts include answer ranking, traffic paths, monetization, and so on

There are also many methods and techniques that help us make better use of blue ocean traffic, such as cross-calculating data and advanced tactics

But expanding on those would take another long article. Time and energy are limited, so let's talk about it next time

Author: CashWar, official account: TACE

Source: Lu Songsong's blog. Please credit the source when reprinting!

This article was written by [Brother Shi love]. Please include a link to the original when reprinting. Thank you