Praw You Are Doing That Too Much Try Again
Real-time bot for monitoring subreddits in Python.
Hello everybody. Reddit is one of the biggest social news aggregation sites in the United States. Members can post about anything in the related subreddit. If you have a product that lots of people are using, then knowing what they feel about the product is really important. Social media platforms like Facebook, Twitter and Reddit are among the most widely used by people. Reddit is also a discussion forum where people can talk about and discuss almost anything. Almost all products, whether it is Microsoft Azure or AWS, have subreddits on which they communicate with their users. So, what if you want to know the sentiment of users and get all the negative headlines from a subreddit, so that you can improve the product accordingly, or if you want to interact with users in real time? Here I will explain how you can get a notification in Slack, with the URL of the headline, whenever a user starts a negative discussion about a product related to you, in real time.
Now, apart from that, suppose you also want to see a time series graph of how frequently users are talking negatively about your product, and you want to visualize it in some form so that you can see how many negative headlines there are, how many people have upvoted those headlines, and how many people are engaged with them.
I guess by now you must have got an idea of what I am trying to achieve. Suppose you are a company that provides voice services to big games like PUBG, Fortnite, Call of Duty, World of Tanks etc., and you want to see what people are saying online about the voice services. If lots of people are upvoting a headline, it means it is a generic issue and you can work on it in real time. But if there are few upvotes, then maybe the problem is at the user's end, like low speed or something.
So, let's start.
1.) First we need a few details like the ones below, which can be created by going to Reddit and following the instructions at https://redditclient.readthedocs.io/en/latest/oauth/
You can use your own Reddit account for that; there is no need to use an official Vivox account, and I am not aware whether we even have an official Reddit account or not.
client_id='client_id', \
client_secret='client_secret', \
user_agent='user_agent', \
username='username', \
password='password'
2.) To interact with Slack we need to create a Slack bot API token, which you can create by following the instructions in the YouTube video below:-
SLACK_API = 'xoxb-44323424-234324243-dfsdfdsfsf'
3.) We will also maintain a CSV file in which we keep each customer's subreddit name information, in a format like the one below:-
4.) We need some libraries to set up the whole process. So, please install the libraries below into your environment:-
5.) After that we will set up Reddit with PRAW, as PRAW is the API wrapper used to interact with Reddit data. If you want to read more about PRAW, please go here: https://praw.readthedocs.io/en/latest/.
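A minimal sketch of this step, assuming the five values from step 1 are kept in environment variables. The `REDDIT_*` variable names and both helper functions are hypothetical; `praw.Reddit` itself accepts the five keyword arguments listed in step 1.

```python
import os

# Keyword arguments praw.Reddit expects for a password-based ("script") app.
CREDENTIAL_KEYS = ("client_id", "client_secret", "user_agent", "username", "password")


def load_credentials(env=os.environ):
    """Collect the Reddit credentials from environment variables.

    The REDDIT_* variable names are an assumption made for this sketch.
    """
    creds = {}
    for key in CREDENTIAL_KEYS:
        var = "REDDIT_" + key.upper()
        if var not in env:
            raise KeyError(f"missing environment variable {var}")
        creds[key] = env[var]
    return creds


def make_reddit(creds):
    """Create the PRAW client.

    praw is imported here so that load_credentials can be used (and tested)
    without the package installed.
    """
    import praw

    return praw.Reddit(**creds)
```

Calling `make_reddit(load_credentials())` would then give the Reddit instance used in the later steps. Keeping the secrets out of the script also means the script can be shared without editing.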
6.) Then we will read the CSV file into a pandas DataFrame. The CSV file holds all the details of the subreddits, and if you need to add or delete any subreddit, you can do that in the CSV file; there will be no need to edit the Python script.
7.) After that, as we are doing this dynamically, we will create two variables: one to hold the subreddit name and the other to hold the customer name.
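Steps 6 and 7 can be sketched as follows. The column names `client` and `subreddit` and the sample rows are assumptions, since the original sample file is not shown here. The sketch uses the standard library `csv` module to stay dependency-free; with pandas, `pandas.read_csv` would give the DataFrame the post refers to.

```python
import csv
import io

# Hypothetical CSV contents: one row per customer and the subreddit to watch.
SAMPLE_CSV = """client,subreddit
PUBG,PUBATTLEGROUNDS
Fortnite,FortNiteBR
"""


def read_watchlist(csv_file):
    """Return a list of (client_name, subreddit_name) pairs from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return [(row["client"].strip(), row["subreddit"].strip()) for row in reader]


watchlist = read_watchlist(io.StringIO(SAMPLE_CSV))
for client_name, subreddit_name in watchlist:
    # These are the two variables from step 7; the rest of the bot runs once per pair.
    print(f"monitoring r/{subreddit_name} for {client_name}")
```

In a real run, `open("clients.csv", newline="")` (a hypothetical file name) would replace the in-memory sample.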
8.) Now, we will read the headlines from the subreddit. As we are building a dynamic bot here, we will read only the new headlines, and to do that we will follow the line of code below. Here we first take the subreddit name into a variable subreddit, and then we read just the new headlines. I have passed limit = 1000 here; you can pass None too, but you won't be able to get more than 1000 headlines, and by default it is 100.
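As a rough sketch of this step, assuming `reddit` is a `praw.Reddit` instance: `reddit.subreddit(name).new(limit=...)` is the PRAW call for the newest submissions, while the wrapper function name is hypothetical.

```python
def fetch_new_submissions(reddit, subreddit_name, limit=1000):
    """Return the newest submissions of a subreddit as a list.

    `reddit` is expected to be a praw.Reddit instance. Passing limit=None
    asks PRAW for as many items as Reddit is willing to return.
    """
    subreddit = reddit.subreddit(subreddit_name)
    return list(subreddit.new(limit=limit))
```

Wrapping the call in a function that receives the client makes it easy to run once per row of the CSV file.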
9.) So, what if you need all the historical headlines? We don't need them here, but if you want to download all of them you can use the Pushshift API (https://github.com/pushshift/api) for that:-
10.) Now we will create a dictionary for the keys which we will be reading from Reddit. There are 95 different values that Reddit provides, but we don't need all of them. As per our requirement I am using the ones below.
11.) Once we have declared the structure of the dictionary, it is time to read the information from Reddit into that dictionary:-
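Steps 10 and 11 might look like the sketch below. The choice of keys is an assumption; the attribute names on the right (`title`, `score`, `id`, `url`, `num_comments`, `created_utc`) are attributes of a PRAW submission. A stand-in object is used so the example runs without a Reddit connection.

```python
from types import SimpleNamespace

# Dictionary keys mapped to the PRAW Submission attribute that fills them.
# This particular selection is an assumption for the sketch.
FIELDS = {
    "title": "title",
    "score": "score",
    "id": "id",
    "url": "url",
    "comms_num": "num_comments",
    "created": "created_utc",
}


def collect_fields(submissions):
    """Build a dict of lists (one list per key) from an iterable of submissions."""
    data = {key: [] for key in FIELDS}
    for submission in submissions:
        for key, attribute in FIELDS.items():
            data[key].append(getattr(submission, attribute))
    return data


# Stand-in object with the same attribute names as a PRAW Submission.
example = SimpleNamespace(
    title="Voice chat keeps dropping",
    score=12,
    id="abc123",
    url="https://example.com/abc123",
    num_comments=4,
    created_utc=1546300800.0,
)
topics = collect_fields([example])
```

A dict of equal-length lists like `topics` is one of the shapes `pandas.DataFrame(...)` accepts directly.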
12.) Finally, we have the information, and now we will convert it into a pandas DataFrame to perform further operations on it:-
13.) Reddit always gives time in epoch format, and we need a general timestamp to perform manipulation on it:-
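A small sketch of the epoch conversion using only the standard library; the function name is hypothetical. On a DataFrame column, `pandas.to_datetime(column, unit="s", utc=True)` does the same job.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Convert a Unix epoch value (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


print(epoch_to_utc(1546300800).isoformat())  # 2019-01-01T00:00:00+00:00
```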
14.) Now, it is time to decide what the time interval of our bot should be; let's say you want to do this every 15 minutes. Also, as Reddit's timezone and your timezone could be different, it is always wise to go for a common timezone, and that is why I am converting both into UTC.
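One way to express the 15-minute window, assuming the post time is the epoch value read from Reddit. The helper name is hypothetical; both sides of the comparison are timezone-aware UTC datetimes, so the local timezone of the machine does not matter.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(minutes=15)


def is_within_interval(created_utc, now=None, interval=INTERVAL):
    """True if a post's epoch timestamp falls inside the last `interval`."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return now - interval <= created <= now
```

Passing `now` explicitly keeps one run of the bot consistent: every subreddit is compared against the same instant.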
15.) Now, it is time to filter out the headlines, which you can do by following the code below:-
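The filter itself is not spelled out in the text, so the following is only one possible sketch. It assumes "negative" is decided by a hypothetical keyword list; a sentiment model could replace `is_negative` without changing the rest.

```python
# Hypothetical keyword list; a sentiment model could replace this check.
NEGATIVE_WORDS = {"broken", "bug", "crash", "lag", "issue", "worst"}


def is_negative(title, negative_words=NEGATIVE_WORDS):
    """Rough check: does the headline contain any of the negative keywords?"""
    words = {word.strip(".,!?:;\"'()").lower() for word in title.split()}
    return bool(words & negative_words)


def filter_negative(rows):
    """Keep only the rows (dicts with a 'title' key) whose headline looks negative."""
    return [row for row in rows if is_negative(row["title"])]
```

For example, `filter_negative([{"title": "Voice chat is broken again!"}, {"title": "Great update"}])` keeps only the first row.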
16.) Post it in the Slack channel:-
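A hedged sketch of the posting step, assuming `slack_client` is a `slack_sdk.WebClient` created with the bot token from step 2. Only its `chat_postMessage(channel=..., text=...)` method is used; the message layout and both function names are assumptions.

```python
def format_alert(client_name, row):
    """Build the Slack message text for one headline (row is a dict)."""
    return (
        f"[{client_name}] {row['title']} "
        f"(score {row['score']}, {row['comms_num']} comments)\n{row['url']}"
    )


def post_alerts(slack_client, channel, client_name, rows):
    """Send one message per headline and return how many were sent.

    `slack_client` is expected to be a slack_sdk.WebClient; any object with a
    compatible chat_postMessage method works.
    """
    for row in rows:
        slack_client.chat_postMessage(channel=channel, text=format_alert(client_name, row))
    return len(rows)
```

A call such as `post_alerts(client, "#reddit-alerts", "PUBG", rows)` would send the filtered headlines; the channel name here is hypothetical.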
17.) Now, what if we want to store the data in a database and visualize it in Grafana? I am using InfluxDB here, so just replace step 15 with the code below and the data will be stored in InfluxDB.
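As an illustration, the rows can be turned into the list-of-dicts point format that the `influxdb` Python package (InfluxDB 1.x) accepts in `InfluxDBClient.write_points`. The measurement name, the tag and field layout, and the function name are assumptions for this sketch.

```python
from datetime import datetime, timezone


def to_influx_points(client_name, subreddit_name, rows, measurement="reddit_headlines"):
    """Turn headline rows (dicts) into InfluxDB points.

    Tags identify the customer and subreddit; upvotes and comment counts are
    stored as integer fields so Grafana can plot them over time.
    """
    points = []
    for row in rows:
        created = datetime.fromtimestamp(row["created"], tz=timezone.utc)
        points.append({
            "measurement": measurement,
            "tags": {"client": client_name, "subreddit": subreddit_name},
            "time": created.isoformat(),
            "fields": {
                "title": row["title"],
                "url": row["url"],
                "score": int(row["score"]),
                "comms_num": int(row["comms_num"]),
            },
        })
    return points
```

With a client created via `InfluxDBClient(host, port, database=...)`, the result would be written with `client.write_points(points)`.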
18.) Set up Grafana for time series visualization and you will get a graph like the one below:-
In the above graph we can see that at midnight someone posted a headline, and 247 people upvoted it and 115 people commented on it in only 15 minutes. So, that's a concern. The Grafana dashboard will look like the one below:-
End Notes:-
- If you are setting a 15-minute time interval, then make sure your script takes less time than that to run for all the clients.
- We can always change the script as per our needs, such as what kind of notification we want.
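The first note can be checked with a small timing wrapper. This is only a sketch: `job` stands for whatever function processes all the clients, and the function name is hypothetical.

```python
import time


def run_cycle(job, interval_seconds=15 * 60):
    """Run one pass of the bot and report whether it fit inside the interval.

    `job` is any zero-argument callable that processes all clients.
    Returns (elapsed_seconds, fits_in_interval).
    """
    started = time.monotonic()
    job()
    elapsed = time.monotonic() - started
    return elapsed, elapsed < interval_seconds
```

If the second value comes back False, either the interval needs to grow or the client list needs to be split across runs.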
I have also written a post on making a Twitter bot. Please have a read:-
Please let me know in the comment section if you get any errors. Please add any valuable suggestions on how I can further improve this script. I am a graduate student at Northeastern University and currently working as a Data Scientist at Vivox.
Please visit the GitHub repository below for the code:-
Please connect on LinkedIn:-
Source: https://towardsdatascience.com/real-time-bot-for-monitoring-subredditts-in-python-691cc692fdb5