Compiled by: Yun Zhao | Planning: Yan Zheng | Produced by: 51CTO Technology Stack (WeChat ID: blog)

October 6, 2010, San Francisco. While people were still enjoying the excitement of the iPhone 4 and its more powerful camera, an iOS photo-sharing app called "Instagram" appeared in the App Store. On its first day, it gained 25,000 users. A week later, downloads had climbed to 100,000. Between October 2010 and December 2011, Instagram's user base grew from 0 to 14 million in just over a year. And its founder, Kevin Systrom, did this with only three engineers.

Let's go back to that moment and look at it from an engineer's perspective to see how they did it. In short, they followed three guiding principles and relied on a solid technology stack: keep things very simple, don't reinvent the wheel, and use proven, reliable technologies whenever possible.

1. Early Basic Configuration

Instagram's early infrastructure ran on AWS, using EC2 and Ubuntu Linux. For reference, EC2 is Amazon's service that lets developers rent virtual computers. To keep things simple, and because I like to think about users from an engineer's perspective, let's walk through the lifecycle of a single user session.

2. Front-end

Scenario review: The user opens the app.

Instagram originally launched as an iOS app in 2010. Since Swift was not released until 2014, we can assume that Instagram was written in Objective-C together with frameworks such as UIKit.

3. Load Balancing

Scenario recap: When the app is opened, a request to fetch photos for the main feed is sent to the backend, where it reaches Instagram's load balancer.

Instagram uses Amazon's Elastic Load Balancer, with 3 NGINX instances behind it that are swapped in and out of rotation depending on their health. Each request first reaches the load balancer and is then routed to an actual application server.

4. Backend

To recap, the load balancer sends the request to an application server, which holds the logic to handle the request correctly. Instagram's application servers run Django, written in Python, with Gunicorn as the WSGI server. For reference, WSGI (Web Server Gateway Interface) is the standard interface through which a web server forwards requests to a Python web application.

Instagram uses Fabric to run commands in parallel on multiple instances at once, which allows code to be deployed in seconds. The application servers run on more than 25 Amazon High-CPU Extra-Large machines. Since the servers themselves are stateless, more machines can be added whenever more requests need to be handled.

5. General Data Storage

Scenario recap: The application server determines that the request needs data for the main feed. To do this, we assume that it requires:
1. Database: Postgres

Scenario review: The application server obtains the latest relevant photo IDs from Postgres.

The application server pulls data from PostgreSQL, which stores most of Instagram's data, such as user and photo metadata. Connections between Postgres and Django are pooled using Pgbouncer.

Instagram sharded its data because of the volume it was receiving (over 25 photos and 90 likes per second). They used code to map thousands of "logical" shards to a few physical shards.

An interesting challenge that Instagram faced and solved was generating chronologically sortable IDs. Each ID packs a millisecond timestamp, a logical shard ID, and a per-shard sequence number into 64 bits, so sorting by ID also sorts by creation time.
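As a rough illustration of the two ideas above, the following Python sketch maps a user to a logical shard, maps logical shards to physical servers, and packs a time-sortable ID. The 41/13/10 bit split (time, shard, sequence) and the January 2011 epoch follow the sharding post listed in the reference links. The shard counts, the range-based server mapping, and all function names are assumptions made for this example, not Instagram's actual code; according to that post, the real IDs were generated inside Postgres itself.

```python
from typing import Tuple

# Assumptions for illustration only: the 41/13/10 bit layout and the
# January 2011 epoch follow the sharding post in the reference links;
# the shard counts and all function names are hypothetical.
EPOCH_MS = 1293840000000   # 2011-01-01 00:00:00 UTC, in Unix milliseconds
LOGICAL_SHARDS = 2000      # example number of logical shards
PHYSICAL_SERVERS = 4       # hypothetical number of physical database servers

SHARD_BITS = 13            # bits reserved for the logical shard ID
SEQ_BITS = 10              # bits reserved for the per-shard sequence


def logical_shard(user_id: int) -> int:
    """Pick a logical shard for a user with a simple modulo."""
    return user_id % LOGICAL_SHARDS


def physical_server(shard_id: int) -> int:
    """Assign contiguous ranges of logical shards to physical servers."""
    shards_per_server = LOGICAL_SHARDS // PHYSICAL_SERVERS
    return shard_id // shards_per_server


def make_id(now_ms: int, shard_id: int, sequence: int) -> int:
    """Pack elapsed milliseconds, shard ID, and sequence into one integer."""
    elapsed_ms = now_ms - EPOCH_MS
    seq = sequence % (1 << SEQ_BITS)
    return (elapsed_ms << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(packed: int) -> Tuple[int, int, int]:
    """Recover (elapsed_ms, shard_id, sequence) from a packed ID."""
    seq = packed & ((1 << SEQ_BITS) - 1)
    shard_id = (packed >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed_ms = packed >> (SHARD_BITS + SEQ_BITS)
    return elapsed_ms, shard_id, seq


# Example: user 31341 posts a photo 1,387,263,000 ms after the epoch.
shard = logical_shard(31341)
photo_id = make_id(EPOCH_MS + 1387263000, shard, 5001)
```

Because the timestamp occupies the highest bits, an ID created one millisecond later is always larger, whatever its shard or sequence number. The shard ID embedded in the ID also means the owning shard can be read back from the ID alone.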
Scenario recap: Thanks to the time-sortable IDs in Postgres, the application server has successfully received the latest relevant photo IDs.

2. Photo Storage: S3 and CloudFront

Scenario recap: The application server then fetches the actual photos matching those photo IDs via a fast CDN link so they load quickly for the user.

Several terabytes of photos are stored in Amazon S3. They are served to users quickly through Amazon CloudFront.

3. Cache: Redis and Memcached

Scenario: To get user data from Postgres, the application server (Django) uses Redis to match photo IDs with user IDs.

Instagram uses Redis to store a mapping from about 300 million photos to the ID of the user who created each one, so that it knows which Postgres shard to query when fetching photos for the home feed, activity feed, and so on. Redis keeps all of this data in memory to reduce latency, and the data is sharded across multiple machines. Through some clever hashing, Instagram was able to store the 300 million key mappings in less than 5 GB.

Scenario review: Thanks to efficient caching with Memcached, fetching user data from Postgres is fast because recent responses are cached.

For general caching, Instagram uses Memcached. They had 6 Memcached instances at the time. Memcached is relatively simple to layer on top of Django. Interesting fact: two years later, in 2013, Facebook published a landmark paper describing how it scaled Memcached to handle billions of requests per second.

The user can now see the home page with the latest photos from the people they follow.

4. Master-Replica Setup

Both Postgres and Redis run in a master-replica setup, using Amazon EBS (Elastic Block Store) snapshots to back up the systems frequently.

6. Push Notifications and Asynchronous Tasks

Scenario review: Now, suppose the user closes the app but then receives a push notification that a friend has posted a photo.
This push notification, along with the billion-plus other push notifications Instagram has sent, was delivered using pyapns, an open-source, universal Apple Push Notification Service (APNS) provider.

Scenario recap: The user loves the photo so much that they decide to share it on Twitter.

On the backend, the task is pushed into Gearman, a task queue that farms out work to machines better suited to it. Instagram has about 200 Python workers consuming the Gearman task queue. Gearman handles a variety of asynchronous tasks, such as pushing an activity (for example, a newly posted photo) to all of a user's followers, which is known as fanning out.

7. Monitoring

Scenario recap: A server error causes the Instagram app to fail, and an error response is sent back. Instagram's three engineers are alerted immediately.

Instagram uses Sentry, an open-source Django application, to monitor Python errors in real time. Munin is used to graph system-wide metrics and alert on anomalies. Instagram also has a number of custom Munin plugins to track application-level metrics, such as photos posted per second. Pingdom is used for external service monitoring, and PagerDuty handles incidents and notifications.

8. Final Architecture Overview

[Figure: final architecture overview]

Postscript

Nineteen months after launch, Instagram had more than 50 million active users. The number later passed 100 million and reached 130 million in June 2013. On October 25, 2012, Facebook acquired Instagram for a total of US$715 million, and founder Kevin Systrom received a return of US$400 million.

It is worth mentioning that Kevin is a self-taught programmer. He studied management and had no programming background when he graduated. While working in the marketing department of the social travel website Nextstop, he began setting aside time every night to teach himself programming.
Instagram's success has not only become one of the great success stories of modern Silicon Valley; Kevin's self-taught journey has also fueled many developers' passion for programming.

Reference links:
https://instagram-engineering.com/what-powers-instagram-hundreds-of-instances-dozens-of-technologies-adf2e22da2ad
https://instagram-engineering.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c
https://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram