How Do I Make A Bot?

This is probably my most asked question. It's also the one I struggle to give a great answer to. It's roughly the equivalent of asking a chef, "How do I make food?" The chef might ask, "Well, what do you know about cooking? What kind of food do you plan on making? What ingredients do you have? Are there any dietary restrictions to keep in mind?" and the list goes on. Writing a bot, or any program for that matter, is no different. What you need to learn depends on what you already know and what your end goals are. This blog post aims to help you answer this question.

Level 0: What Language Should I Learn?

What language to learn is pretty subjective, and different people will give you different answers. Bots can be written in pretty much any language as long as you have an HTTP client. There are bots on the market written in JavaScript, Go, Java, C#, you name it.

What I will say is that the user-facing web runs on JavaScript. This means that reverse engineering the logic of a website's anti-bot will require some JS knowledge. JS also has a thriving ecosystem with millions of libraries that make your life a lot easier. Electron, for example, is a popular framework that allows you to build desktop applications using JavaScript. For these reasons, I believe JavaScript is a great language to start out with, but it's not mandatory. You can make bots in other languages as well.

Level 1: I chose a language but where do I learn it?

Where and how you choose to learn a language is completely up to your learning style. Some people are visual learners, some are auditory learners, some prefer books, etc.

YouTube can be a great resource to see what you're getting yourself into before you even start. A search for "JavaScript Tutorial" alone can yield thousands of results.

There are also great sites that offer free online curriculums for learning how to code, such as freeCodeCamp, Codecademy, and Codewars. The great thing about these sites is that they offer real-time feedback, so you can tell when you're doing things correctly. They also have big communities filled with people who want to learn or want to help.

If you're like me and find these approaches to learning new things to be boring then there's nothing wrong with just googling small pieces of information until you get your application working as intended. I do this a lot when learning new languages and libraries and it helps keep me engaged. The end goal for me usually isn't "I want to learn JavaScript" but instead "I want to create this thing I have in my brain." You can always pick up on things you missed at a later time.

The list of resources for learning a language is truly endless so choose what's best for you and have some goals in mind. The one thing to remember is that there is no such thing as mastering a language.

Level 2: I learned a language but I still don't know how to make a bot

This is perfectly valid. Learning a programming language does not mean that you know how to build everything in said language. There are still libraries to learn and fields to study. Before even typing a single line of code for your bot, I'd recommend opening up your browser's development tools and poking around.

The DOM of example.com shown in Google Chrome's DevTools.

This is the DOM of example.com. Learning about what the DOM is, what HTML is, and some CSS can benefit you in your quest to make a bot. These are some of the building blocks of the web and can help you when you need to traverse a site's DOM to find a product or build a site of your own.
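If "traversing the DOM" sounds abstract, here's a minimal sketch of the idea. It uses plain objects in place of real DOM elements so it can run outside a browser, and the page contents, product names, and the `findByClass` helper are all made up for illustration. In a real browser you'd reach for something like `document.querySelectorAll(".product")` instead of writing the walk yourself.

```javascript
// Plain objects standing in for DOM elements (hypothetical page contents).
const page = {
  tag: "body",
  classes: [],
  children: [
    { tag: "h1", classes: [], text: "Shop", children: [] },
    {
      tag: "ul",
      classes: ["products"],
      children: [
        { tag: "li", classes: ["product"], text: "Example Tee", children: [] },
        { tag: "li", classes: ["product", "sold-out"], text: "Example Cap", children: [] },
      ],
    },
  ],
};

// Depth-first walk that collects every node carrying the given class.
function findByClass(node, className, found = []) {
  if (node.classes.includes(className)) found.push(node);
  for (const child of node.children) findByClass(child, className, found);
  return found;
}

const productNames = findByClass(page, "product").map((node) => node.text);
console.log(productNames); // → [ 'Example Tee', 'Example Cap' ]
```

The point is only that a page is a tree, and finding a product means walking that tree until you hit the nodes you care about.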

Google Chrome DevTools Network Tab

This is the network tab after clicking a link on example.com. It shows all of the inbound and outbound network requests made on the current browser tab. This is important because request-based bots are usually created by replaying the requests necessary to complete a transaction. With the network tab you can see the requests made, what headers were sent, what headers were received, the responses given, etc.
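To give a rough idea of what "replaying" a request looks like in code, here's a small sketch. It takes a request you copied out of the network tab and turns it into arguments you could hand to `fetch`. The `toFetchArgs` helper, the shape of the `captured` object, and the header values are my own assumptions for illustration, not something DevTools exports in this exact form.

```javascript
// Headers that HTTP clients normally set on their own, so this sketch
// does not copy them over from the captured request.
const SKIPPED_HEADERS = new Set(["host", "content-length", "connection"]);

// Turn a captured request (hypothetical shape) into [url, options] for fetch.
function toFetchArgs(captured) {
  const headers = {};
  for (const [name, value] of Object.entries(captured.headers ?? {})) {
    const lower = name.toLowerCase();
    if (!SKIPPED_HEADERS.has(lower)) headers[lower] = value;
  }
  const options = { method: captured.method ?? "GET", headers };
  if (captured.body !== undefined) options.body = captured.body;
  return [captured.url, options];
}

// A request copied by hand from the network tab (values are examples).
const captured = {
  url: "https://example.com/",
  method: "GET",
  headers: {
    Host: "example.com",
    Accept: "text/html",
    "User-Agent": "Mozilla/5.0",
  },
};

const [url, options] = toFetchArgs(captured);
console.log(url, options);
```

To actually send it, you'd call `fetch(url, options)` in a browser or in Node 18 and later. I left that out so the sketch runs without touching the network.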

Google Chrome DevTools Application Tab

This is the Application tab of DevTools, specifically the Cookies section. Cookies are a way for sites to store data on a user's machine while they browse. This is important because it's one of the ways sites keep track of state, such as whether you're logged in, from one request to the next.
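Here's a minimal sketch of how that state gets carried around. A server sends cookies in `Set-Cookie` response headers, and the browser sends them back in a `Cookie` request header. The `storeCookies` and `cookieHeader` helpers and the cookie values are assumptions of mine, and the sketch ignores attributes like `Path`, `Expires`, and `Domain` that a real cookie jar has to respect.

```javascript
// Save cookies from Set-Cookie headers into a Map acting as a cookie jar.
function storeCookies(jar, setCookieHeaders) {
  for (const header of setCookieHeaders) {
    // Only the first "name=value" pair is the cookie itself. Everything
    // after the first ";" is an attribute, which this sketch ignores.
    const pair = header.split(";")[0];
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    jar.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
  }
  return jar;
}

// Build the Cookie header value to send with the next request.
function cookieHeader(jar) {
  return [...jar].map(([name, value]) => `${name}=${value}`).join("; ");
}

const jar = new Map();
storeCookies(jar, ["session=abc123; Path=/; HttpOnly", "cart=2; Path=/"]);
storeCookies(jar, ["cart=3; Path=/"]); // a later response updates the cart
console.log(cookieHeader(jar)); // → session=abc123; cart=3
```

A request-based bot has to do this bookkeeping itself (or use a library that does), because there's no browser doing it in the background.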

If you're still confused reading all of this, that's okay. You now have some things to research. These are a few of the things that make up the web as you know it, and it's important that you learn them.

Document Object Model (DOM)

Cookies

Network Requests

Chrome DevTools

Level 3: I understand how the web works, how do I automate it?

Now we get to the fun part. Building the actual bot!

To do this, you can decide whether you wanna go the request-based route or the browser-based route.

A request-based bot uses only network requests and doesn't require a browser to function. The bare minimum you need is an HTTP client. Which client you use depends on your programming language. Some JavaScript examples include Axios, Got, and more. Python has http.client. Java has OkHttp. The list goes on. Lots of languages, including the ones I've listed, even have built-in HTTP clients.

A browser-based bot automates the checkout process by sending commands to a web browser. Popular libraries for this include Selenium, Puppeteer, ChromeDP, and more. A benefit of this form of botting is that it may be easier to fool some anti-bots, since they may just think you're a normal user using a browser. That said, there are ways for sites to detect people using browser automation.

At this point, you're writing code that tells the program to do what you, the user, would do, or what the browser would do.

For a more in-depth explanation check out my Anatomy of a Supreme Bot series.

Level 4: I can write a bot but reverse engineering is hard

Learning how to code doesn't mean you can reverse engineer code, but it does make things a lot easier. Reverse engineering is all about learning how something works. This means you already performed some form of reverse engineering by snooping through the network requests and learning which requests, headers, and data are necessary. A lot of sites will deploy anti-bot scripts to try to give people a fair shot or to stop attacks like credential stuffing.

These scripts are often obfuscated to prevent people from understanding how they work. This practice is known as security through obscurity. Although this can make reverse engineering significantly harder, it doesn't make it impossible.

Original code on the left, obfuscated code on the right.

Reverse engineering obfuscated scripts is all about figuring out what the script does. This doesn't mean the entirety of the script, just the crucial parts. There are tools out there to help beautify JavaScript code, like Beautifier.io and de4js, but they still won't give you back the original code.

In some cases you may need to build your own tools using AST manipulation or use a debugger to step through the code line by line and see what it's doing.
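To make "AST manipulation" a little more concrete, here's a minimal sketch. An AST (abstract syntax tree) is the tree a parser builds from source code. To keep this self-contained, I wrote the tree by hand in the ESTree shape instead of running a parser (in practice you'd get it from a parser such as Acorn), and `foldStrings` is a name I made up. It undoes one simple obfuscation trick: splitting a string into pieces that get concatenated at runtime.

```javascript
// Fold "a" + "b" into "ab" anywhere in a (hand-written) ESTree-shaped tree.
function foldStrings(node) {
  if (node.type !== "BinaryExpression") return node;

  // Simplify the children first so nested concatenations collapse too.
  const left = foldStrings(node.left);
  const right = foldStrings(node.right);

  const bothStrings =
    left.type === "Literal" &&
    right.type === "Literal" &&
    typeof left.value === "string" &&
    typeof right.value === "string";

  if (node.operator === "+" && bothStrings) {
    return { type: "Literal", value: left.value + right.value };
  }
  // Anything else is left alone, just with simplified children.
  return { ...node, left, right };
}

// Tree for the expression: "che" + "ck" + "out"
const obfuscated = {
  type: "BinaryExpression",
  operator: "+",
  left: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Literal", value: "che" },
    right: { type: "Literal", value: "ck" },
  },
  right: { type: "Literal", value: "out" },
};

console.log(foldStrings(obfuscated)); // → { type: 'Literal', value: 'checkout' }
```

Real obfuscators do far more than this, but most deobfuscation tools are built out of many small passes like this one, each undoing a single trick.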

I wrote an article on writing a tool to deobfuscate some code so check it out for more details.

The most important thing to remember about reverse engineering is that it's normal not to know what's going on. That's exactly what the developers intended when they obfuscated their scripts. This step requires the most patience but is definitely the most rewarding.

Also, you'd be surprised at how much information you can get from just googling. Sometimes the companies themselves write papers or release information on how their anti-bots work, and sometimes people like me release tools and information for free for people like you to learn from.

Resources

I hope this gave some insight into how you can build your own bot. Starting out can seem very daunting but just remember that nobody is a master and there's always more you can learn.

Here is a dump of links to help you out along the way.

awesome-javascript

awesome-python

Chrome DevTools

Firefox DevTools

freeCodeCamp

Codecademy

HTTP

DOM

Charles

mitmproxy

Anatomy of a Supreme Bot

Tackling JavaScript Client-Side Security

Feel free to contact me on Twitter

Pce!