
Scraping Live Betting Data and Finding Surebets

In this article, we will write a program to make surebets by scraping live data from multiple websites.

In this article, we will learn about sports betting and how you can make the most out of it with just a few lines of code. We will start by learning a bit about surebets (also called sports arbitrage). Then we will write code to scrape odds from two sports betting websites and find surebets in them. Finally, we will package everything into a Docker container, along with a UI and an API built with Flask.

An arbitrage in sports betting is when a bettor places multiple bets on the same event to guarantee a profit no matter the result. It usually arises because different sportsbooks offer different odds on the same event. This means we can split the money that we want to bet across multiple bookmakers so that we end up with a profit whatever the outcome.

The tutorial is divided into 5 parts briefly explained below:

  1. Scraping Live Data: We will start by writing two scripts that scrape live data using Selenium. We use Selenium because it drives a real browser, so it can scrape dynamically rendered data more easily than libraries that only fetch static HTML.
  2. Cleaning Data: This includes converting the odds from different bookmakers to a common format and matching team names across bookmakers.
  3. Find Surebets: We will write code to find surebets and calculate the potential profit.
  4. Building UI and API: We will also build a UI and an API using Flask so that other people can benefit from our code.
  5. Docker Container: Finally, we will package all of that into a Docker container to make it more reusable.

If you want to jump straight to code, jump to the GitHub repo: sports_betting_app

Scraping Live Data

For surebets to work, we need to scrape data from at least two websites. We will be scraping from Bwin and Betfair. Let us start by importing the required libraries.

Importing Libraries

I will explain the use of each library as we use them.

Bwin

Let’s start by scraping Bwin. Let’s first open Firefox using webdriver and use Options to open it in headless mode, so it won’t interfere with other programs

Choose value from the dropdown

For our purposes, we will only be scraping odds from Both Teams to Score. For this, we first have to select the appropriate value from the dropdown. Let’s inspect the webpage and identify the appropriate selector

We need to select the ms-group-selector element and click it. After that, the Both Teams to Score value is the 10th element inside the dropdown. We will select that and click it as well.

We use WebDriverWait and EC to wait until the element is clickable. After executing the above code, we will get the following on the screen:

Scrape the odds

We will now scrape the highlighted portion above, i.e., the odds that both teams will score in a given match.

Let’s break down the above code:

  1. We start by selecting the main box that contains the team names and the odds.
  2. The odds are divided into rows, one for each match, and we select those rows.
  3. We loop through the rows to scrape the odds, skipping any odds that are empty.
  4. As we are only interested in the Both Teams to Score odds, we append only the first odd to the btts list.
  5. Then we select the teams and finally close the browser.

Store the scraped data

Finally, we write the data to a pandas DataFrame and store it with pickle, using the following code:

This is the final output that we get:

Betfair

We will start the same way, by importing the libraries and opening the website link.

Clicking the Accept Button

The Betfair website shows a pop-up to make sure you accept the cookies. So, we first need to locate that popup and accept the cookies to continue scraping. We will also initialize some variables for later.

Choosing the value from the dropdown

As we did in the case of Bwin, let’s choose the appropriate value from the dropdown using the market variable that we defined.

Scrape the odds

Now let’s write code to scrape the odds for each match

Let’s break it down

  1. We start by selecting the div containing the odds and the respective rows for each match.
  2. We loop through each row to extract the odds and the teams
  3. We store the teams and the odds in respective lists and close the browser

Store the scraped data

Finally, we store the data as before:

Here’s the output in this case:

Cleaning Data

Invalid Odds

Let’s now start cleaning the data. For Both Teams to Score, some matches have empty odds, and some have only one odd because the other is too small in comparison. No surebet can be found in those cases, so we convert them to the placeholder 0\n0 (two zero odds separated by a newline) to keep the format consistent.
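As a rough illustration of this cleaning step, here is a minimal sketch. It assumes each scraped entry is a string holding the two decimal odds separated by a newline; the helper name clean_odds and the regular expression are hypothetical and are not taken from the original code.

```python
import re

# Placeholder used for matches without a usable pair of odds.
PLACEHOLDER = "0\n0"

# Assumption: odds are plain decimal numbers such as "1.80" or "3".
_DECIMAL_ODD = re.compile(r"^\d+(\.\d+)?$")


def clean_odds(raw):
    """Return the two odds joined by a newline, or the placeholder."""
    parts = [p.strip() for p in (raw or "").split("\n") if p.strip()]
    if len(parts) == 2 and all(_DECIMAL_ODD.match(p) for p in parts):
        return "\n".join(parts)
    # Empty entry, a single odd, or anything unparsable.
    return PLACEHOLDER


# Made-up sample values: valid, empty, single odd, valid.
scraped = ["1.80\n1.95", "", "1.05", "2.10\n1.66"]
cleaned = [clean_odds(o) for o in scraped]
```

Keeping the placeholder in the same two-line shape means later steps can split every entry the same way without special cases.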

Matching Teams

Now, bookmakers often name the same team differently. Some use the full name, some use a short form. For this reason, we use the fuzzywuzzy library to compare team names across bookmakers. It gives a similarity score ranging from 0 to 100. Let’s settle on 65 as a threshold. Even then, it may match teams that are not the same, so you need to check the matches yourself.
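To show the idea of threshold-based name matching without extra dependencies, the sketch below swaps in difflib.SequenceMatcher from the Python standard library in place of fuzzywuzzy, so the scores will not be identical to fuzzywuzzy's. The function names and the sample team lists are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score from 0 to 100."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def match_teams(names_a, names_b, threshold=65):
    """Map each name in names_a to its best match in names_b, or None."""
    matches = {}
    for name in names_a:
        # Pick the candidate with the highest score, if there is any candidate.
        best = max(names_b, key=lambda other: similarity(name, other), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches[name] = best
        else:
            matches[name] = None  # nothing similar enough
    return matches


# Made-up sample names from two bookmakers.
bwin_teams = ["Manchester United", "Arsenal"]
betfair_teams = ["Man United", "Chelsea"]
matched = match_teams(bwin_teams, betfair_teams)
```

Names that fall below the threshold are mapped to None rather than dropped, which makes it easier to review the unmatched teams by hand.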

This is the output we get:

Find Surebets

Before we dive into code, let’s first explain a formula that we will be using to calculate surebets by using an example.

(1 / odds_1) + (1 / odds_2) < 1.0

╔══════════════╦══════════════╦═══════════════╗
║              ║    Btts?     ║   Not btts?   ║
╠══════════════╬══════════════╬═══════════════╣
║ Bookmaker 1  ║      3       ║      1.9      ║
║ Bookmaker 2  ║      2.8     ║      3.1      ║
╚══════════════╩══════════════╩═══════════════╝

In a given football match, either both teams score or they don't. If any pair of odds, one for each outcome, satisfies the above formula, we have a surebet. In this case, the odds 3 (Btts at Bookmaker 1) and 3.1 (Not btts at Bookmaker 2) satisfy the formula, i.e., (1/3) + (1/3.1) ≈ 0.66 < 1.0. Now, let’s say you have $50 to bet. You decide the stakes by solving the following two equations:

Stake_1 + Stake_2 = Total Stake  ->   eq. (1)

Odds1*Stake_1 = Odds2*Stake_2    ->   eq. (2)
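For reference, the two equations can be solved in closed form. The derivation below is a sketch that writes T for the total stake, o_1 and o_2 for the two odds, and s_1 and s_2 for the two stakes; these symbols are shorthand introduced here, not names from the code.

```latex
\begin{align*}
s_1 + s_2 &= T, \qquad o_1 s_1 = o_2 s_2 \\
o_1 s_1 &= o_2 (T - s_1) \\
s_1 &= \frac{o_2}{o_1 + o_2}\, T, \qquad
s_2 = \frac{o_1}{o_1 + o_2}\, T \\
\text{payout} &= o_1 s_1 = o_2 s_2 = \frac{o_1 o_2}{o_1 + o_2}\, T
\end{align*}
```

The payout exceeds T exactly when 1/o_1 + 1/o_2 < 1, which is the surebet condition stated earlier.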

Since we know the total stake and the odds, solving these two equations gives a stake of $25.41 on Bookmaker 1 and the remaining $24.59 on Bookmaker 2. There are two possible scenarios:

Scenario 1: You win on Bookmaker 1 but lose on Bookmaker 2

Profit 1 = 25.41*3 - 25.41 - 24.59 = 26.23

Scenario 2: You win on Bookmaker 2 but lose on Bookmaker 1

Profit 2 = 24.59*3.1 - 25.41 - 24.59 = 26.23

So, no matter what, you will always gain about $26.23 in this case.

Let’s now start by writing a function to calculate surebets:

Let’s break down the function a bit:

  1. We start by splitting the odds into two separate columns: one for both teams scoring and one for not.
  2. We then apply the formula described above.
  3. We select the matches whose odds satisfy the formula.
  4. We put the surebets found into a dictionary.

Now we need to write a function that tells us the stake to place on each outcome and the profit we will get from each surebet.

Let’s walk through this function:

  1. It accepts as arguments the set of odds that we found surebet for and the total stake.
  2. To calculate the different stakes, we solve two equations for the values.
  3. We then calculate the profit for each scenario and the percentage benefit
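A minimal sketch of such a function is shown below. The name beat_bookies comes from this article, but the signature, the returned keys, and the body are assumptions made for illustration; the stakes are chosen so that both outcomes pay out the same amount.

```python
def beat_bookies(odds_1, odds_2, total_stake):
    """Split total_stake across two odds so both outcomes pay out equally."""
    # Solve: stake_1 + stake_2 = total_stake and odds_1*stake_1 = odds_2*stake_2
    stake_1 = total_stake * odds_2 / (odds_1 + odds_2)
    stake_2 = total_stake - stake_1
    # Profit in each scenario: winnings minus everything that was staked.
    profit_1 = odds_1 * stake_1 - total_stake
    profit_2 = odds_2 * stake_2 - total_stake
    # Percentage benefit relative to the total stake (worst case).
    benefit = min(profit_1, profit_2) / total_stake * 100
    return {
        "stake_1": round(stake_1, 2),
        "stake_2": round(stake_2, 2),
        "profit_1": round(profit_1, 2),
        "profit_2": round(profit_2, 2),
        "benefit_pct": round(benefit, 2),
    }


# Odds 3 and 3.1 from the example table, with a total stake of 50.
result = beat_bookies(3, 3.1, 50)
```

Rounding only happens in the returned values, so the intermediate calculation keeps full precision.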

I will set total_stake to $50 in my case. You can set it to whatever you’d like. We will loop through each surebet and calculate these values.

Let’s review the final piece of code in this section:

  1. We start by defining a table, using the PrettyTable library, to show the values later on.
  2. We loop through the surebet dictionary and, for each match, check which odds form a surebet.
  3. We extract those odds and pass them to our beat_bookies function.
  4. We then add a row to our table from the output of that function.

The final table will look like this:

Congratulations! You have successfully written a program that makes sure you don’t lose on betting.

Building UI and API

So, we are done with the scraping and finding surebets. Now let’s build a UI and an API, so even those who don’t know programming can access this.

But for this, we will have to first organize our code into different modules. There will be 5 different modules in our case.

  1. Bwin Scraper (scrape_bwin.py)
  2. Betfair Scraper (scrape_betfair.py)
  3. Surebets finder (utils.py)
  4. API (api.py)
  5. UI (index.py)

This will be the structure of our directory. I will explain each file as we go:

Below are the first three modules (which we have covered above) reorganized

scrape_bwin.py

scrape_betfair.py

Now let’s move onto the UI and the API

API

There will be a total of three endpoints in our API. Two for scraping Betfair and Bwin and one for finding the surebets and the profits.

We will be using Flask for this purpose as it helps in building simple APIs really fast. I will introduce you to the whole code of API and then explain it.

api.py

Let’s dissect this file:

  1. We start by importing the required libraries. These include Flask and the functions from the modules above. We alias the scraper modules to avoid confusion, as their functions have the same names.
  2. We define the app variable, which represents our web app.
  3. We then use the @app.route syntax, which is a decorator. All you need to know is that whenever a request hits this route, the function below the decorator is called.
  4. We only allow the GET method, as we are not passing any data from a form.
  5. For the scraping endpoints, we call the functions in the order they are defined in their respective modules.
  6. Notice jsonify in the last line of each endpoint. JSON is the standard data format in web applications, so we convert our output to JSON before sending it.
  7. Our last endpoint calls the functions in utils.py to calculate the surebets.
  8. Lastly, we run the app on port 80.

Let’s run this file using the command python api.py:

Currently, there is no UI, so we can’t view the app in our browser. Instead, we will use curl to access the API endpoints. We need to keep the above terminal running and open a new tab for this.

This is the response we get when we access the endpoint for scraping Bwin. Similarly, for Betfair, we would get the following response:

Finally, let’s access the surebets endpoint to see the response:

In this case, we got just one surebet. Note that we set total_stake to $150 in our api.py file. That’s it for the API.

UI

The code in the index.py will be quite similar to our api.py file. The main difference would be that we will render templates for our users.

As you can see the code is almost the same as api.py except that jsonify is replaced with render_template. We have two templates.

  1. index.html. This contains the main page of the UI along with the page that shows the surebets

The template checks for surebets. If they exist, it shows them in a table; otherwise, it shows the main page with links to the other pages. We pass url_for the name of the function we defined for a route in index.py, and it generates the URL of that page.

  2. scrape.html. This contains the results from scraping either Betfair or Bwin.

It just checks whether it has received any odds and then renders them.

We will run it as before and access the webpage on localhost.

Following are the pages rendered by the app:

Main (home) page

We can click on each of these to navigate accordingly. Do remember that the scraping pages take some time to load.

Scraping Page

The same would appear in case of Betfair. Let’s see the surebets page:

Surebets Page

At the time there was only one surebet, so the page displays that. It will show more in case there are several surebets present.

Congratulations! You successfully built an API and a UI for finding surebets and scraping betting websites.

Let’s containerize the application so other people can use it as well.

Containerize the App using Docker

To make our app more useful and easily accessible to people, we will containerize it using Docker. We will create a Dockerfile in the same folder as index.py. Let’s examine the Dockerfile.

The commands are executed one by one here.

  1. First, we install Python, create a new working directory, and copy all the contents of the current directory into it.
  2. Then, we run the commands to upgrade pip and install all the necessary libraries.
  3. Finally, we run the index.py file, which starts the server of our app.

To make an image, first, go inside the folder where your Dockerfile exists, then run the following command:

This will take some time to execute. After it’s done you can run your app as follows:

The -p flag maps a port on your machine to the port the application listens on inside the container. You can see the container running in Docker Desktop and also view the application in the browser.

I have also pushed this docker image to docker hub so you can download and play around with this application here: sports_betting_app

Conclusion

In this article, we looked into scraping betting sites and finding surebets:

  1. We started by scraping betting sites, namely Betfair and Bwin, using Selenium.
  2. We then cleaned the data to make it ready for analysis and for finding surebets.
  3. We then wrote functions to calculate the stakes and profits for each surebet.
  4. After that, we created an API and a UI for our application using Flask.
  5. Finally, we built a Docker image so that anyone can use this application.

 

February 27, 2021
© 2021 Ernesto.  All rights reserved.  