Internet bot in the context of Client–server model
⭐ Core Definition: Internet bot

An Internet bot (also called a web robot, or simply a bot) is a software application that runs automated tasks (scripts) over the Internet, usually with the intent to imitate human activity, such as messaging, at large scale. An Internet bot plays the client role in a client–server model, while the server role is usually played by web servers. Bots can perform simple, repetitive tasks far faster than a person could. The most extensive use of bots is web crawling, in which an automated script fetches, analyzes, and files information from web servers. More than half of all web traffic is generated by bots.
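The client role described above can be sketched with Python's standard library: the bot opens an HTTP connection, sends a request, and reads the server's reply. To keep the sketch self-contained, a throwaway local server stands in for a real web server; the bot name `ExampleBot/1.0` and the response text are made up for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class Handler(BaseHTTPRequestHandler):
    """Stand-in web server playing the server role."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from the server")

    def log_message(self, *args):
        pass  # keep the example quiet

# Start the stand-in server on an ephemeral local port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The bot is the client: it sends an HTTP request and reads the reply,
# identifying itself with a User-Agent header as well-behaved bots do.
url = f"http://127.0.0.1:{server.server_port}/"
req = Request(url, headers={"User-Agent": "ExampleBot/1.0"})
with urlopen(req) as resp:
    body = resp.read().decode()

server.shutdown()
print(body)  # hello from the server
```

A real bot would loop over many such requests, which is exactly what makes it faster than a human at the same task.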

Efforts by web servers to restrict bots vary. Some servers publish a robots.txt file containing rules that request how bots should behave on the website. Any bot that does not follow the rules could, in theory, be denied access to, or removed from, the affected website. However, a website owner cannot force a bot to follow the rules, or ensure that a bot's creator or operator reads or acknowledges the robots.txt file. Some bots are benign, such as search engine spiders, while others are used maliciously, for example to launch attacks on political campaigns.
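Python's standard library ships a parser for exactly these rules, which shows how a cooperating bot would check robots.txt before fetching a page. The robots.txt content, bot name, and URLs below are hypothetical; a real bot would fetch the file from the site's `/robots.txt` path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all bots are asked to stay out of /private/.
rules = """
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A cooperating bot consults the rules before each fetch.
print(parser.can_fetch("ExampleBot", "https://example.com/private/page"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/public/page"))   # True
```

Note that nothing in the protocol enforces this check: a bot that skips the `can_fetch` call simply ignores the rules, which is why robots.txt is a request rather than a barrier.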


In this Dossier

Internet bot in the context of Twitter

X, formerly known as Twitter, is an American microblogging and social networking service. It is one of the world's largest social media platforms and one of the most-visited websites. Users can share short text messages, images, and videos in short posts (commonly and unofficially known as "tweets") and like other users' content. The platform also includes direct messaging, video and audio calling, bookmarks, lists, communities, Grok chatbot integration, job search, and a social audio feature (X Spaces). Users can vote on context added by approved users using the Community Notes feature.

The platform, then called twttr, was created in March 2006 by Jack Dorsey, Noah Glass, Biz Stone, and Evan Williams, and was launched in July of that year; it was renamed Twitter some months later. Twitter grew quickly; by 2012 more than 100 million users produced 340 million daily tweets. Twitter, Inc., was based in San Francisco, California, and had more than 25 offices around the world. A signature characteristic of the service initially was that posts were required to be brief. Posts were initially limited to 140 characters, which was changed to 280 characters in 2017. The limitation was removed for subscribed accounts in 2023. 10% of users produce over 80% of tweets. In 2020, it was estimated that approximately 48 million accounts (15% of all accounts) were run by internet bots rather than humans.

View the full Wikipedia page for Twitter

Internet bot in the context of Web crawler

A web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).

Web search engines and some other websites use Web crawling or spidering software to update their web content or indices of other sites' web content. Web crawlers copy pages for processing by a search engine, which indexes the downloaded pages so that users can search more efficiently.
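The systematic browsing described above reduces to a simple loop: fetch a page, extract its links, and queue any unseen links for fetching. The sketch below uses an in-memory dictionary of pages as a stand-in for real HTTP fetches (the page contents are invented), with `html.parser` from the standard library doing the link extraction.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href targets of all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical in-memory "web": URL -> HTML, standing in for HTTP fetches.
pages = {
    "/": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/b">B</a>',
    "/b": '<a href="/">home</a>',
}

def crawl(start):
    """Breadth-first crawl: visit each reachable page exactly once."""
    seen, frontier = set(), [start]
    while frontier:
        url = frontier.pop(0)
        if url in seen or url not in pages:
            continue
        seen.add(url)
        extractor = LinkExtractor()
        extractor.feed(pages[url])
        frontier.extend(extractor.links)
    return seen

print(sorted(crawl("/")))  # ['/', '/a', '/b']
```

A production crawler adds politeness delays, robots.txt checks, and per-host queues on top of this same visit-once loop, and hands each fetched page to the indexer.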

View the full Wikipedia page for Web crawler

Internet bot in the context of Web scraping

Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may directly access the World Wide Web using the Hypertext Transfer Protocol or a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis.

Scraping a web page involves fetching it and then extracting data from it. Fetching is the downloading of a page (which a browser does when a user views a page); web crawling is therefore a main component of web scraping, fetching pages for later processing. Once a page has been fetched, extraction can take place: the content may be parsed, searched, and reformatted, and its data copied into a spreadsheet or loaded into a database. Web scrapers typically take something out of a page in order to use it for another purpose somewhere else. An example would be finding and copying names and telephone numbers, companies and their URLs, or e-mail addresses to a list (contact scraping).
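The contact-scraping example above can be sketched as the extraction step alone: given already-fetched HTML, pull out every e-mail address with a regular expression. The page content and addresses below are invented; a real scraper would obtain the HTML over HTTP first.

```python
import re

# Hypothetical fetched page; a real scraper would download this over HTTP.
html = """
<ul>
  <li>Ada Lovelace &lt;ada@example.com&gt;</li>
  <li>Alan Turing &lt;alan@example.org&gt;</li>
</ul>
"""

# A simple pattern for e-mail-like strings (not a full RFC 5322 matcher).
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", html)
print(emails)  # ['ada@example.com', 'alan@example.org']
```

From here, the "copying into a central store" step is just writing `emails` to a CSV row or a database table.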

View the full Wikipedia page for Web scraping