Cloud computing has fundamentally changed how we build and deploy applications. Instead of buying servers and managing data centers, you rent computing resources on-demand from providers like AWS, Azure, or Google Cloud. Let’s break down what cloud computing actually means, how it works under the hood, and what you need to know to build effective cloud systems.
What is Cloud Computing?
At its core, cloud computing means accessing computing resources (servers, storage, databases, networking) over the internet instead of owning and maintaining physical hardware yourself. Think of it like electricity: you don’t generate your own power, you plug into the grid and pay for what you use.
The key characteristics that define cloud computing:
- On-demand self-service: Provision resources automatically without human interaction with the provider
- Broad network access: Available over the network, accessible from any device
- Resource pooling: Provider’s resources serve multiple customers with different physical and virtual resources dynamically assigned
- Rapid elasticity: Scale up or down quickly based on demand
- Measured service: Pay only for what you use, similar to utilities
Here’s what you need to understand: cloud computing isn’t just “someone else’s computer”—it’s a fundamentally different operational model that enables capabilities impossible with traditional infrastructure.
Cloud Service Models: IaaS, PaaS, and SaaS
Cloud services fall into three main categories, each offering different levels of abstraction and control.
Infrastructure as a Service (IaaS)
IaaS provides virtual machines, storage, and networking. You manage the operating system, runtime, and applications—the provider manages the physical infrastructure.
Examples: AWS EC2, Azure Virtual Machines, Google Compute Engine
Use cases:
- When you need full control over the operating system
- Running legacy applications in the cloud
- Custom software stacks that require specific OS configurations
# Example: Launching an AWS EC2 instance
aws ec2 run-instances \
--image-id ami-0abcdef1234567890 \
--instance-type t3.medium \
--key-name my-key-pair \
--security-group-ids sg-0123456789abcdef \
--subnet-id subnet-0123456789abcdef \
--user-data file://startup-script.sh
With IaaS, you get flexibility but also responsibility. You’re managing OS patches, security updates, and configuration—just like physical servers, but without the hardware headaches.
Platform as a Service (PaaS)
PaaS abstracts away infrastructure management. You deploy your code, and the platform handles servers, scaling, and maintenance.
Examples: AWS Elastic Beanstalk, Google App Engine, Azure App Service, Heroku
Use cases:
- Web applications where you want to focus on code, not infrastructure
- Rapid prototyping and development
- Teams without dedicated DevOps expertise
# Example: Deploying to Google App Engine
# app.yaml configuration
runtime: python39
entrypoint: gunicorn -b :$PORT main:app
instance_class: F2
automatic_scaling:
  max_instances: 10
  min_instances: 1
  target_cpu_utilization: 0.65

# Deploy with a single command:
# gcloud app deploy
PaaS trades control for simplicity. You can’t customize the underlying OS, but you also don’t have to maintain it.
Software as a Service (SaaS)
SaaS delivers complete applications over the internet. You use the software—you don’t manage anything underneath.
Examples: Gmail, Salesforce, Slack, Google Workspace, Office 365
Use cases:
- Business applications (email, CRM, collaboration)
- When building the software yourself provides no competitive advantage
- Rapid deployment with zero infrastructure management
Most people use SaaS daily without thinking about it. When you check Gmail, you’re using SaaS—Google handles everything from servers to application updates.
How Cloud Infrastructure Actually Works
Let’s go deeper into what’s happening when you use cloud services. I’ll use AWS as an example, but the concepts apply to all major cloud providers.
Virtualization: The Foundation
Cloud computing is built on virtualization—running multiple virtual machines (VMs) on a single physical server. Here’s the architecture:
Physical Server (Host Machine)
└── Hypervisor (VMware ESXi, KVM, Xen)
    ├── VM 1 (Customer A)
    │   ├── Guest OS (Linux)
    │   ├── Applications
    │   └── Allocated Resources (4 vCPUs, 8 GB RAM)
    ├── VM 2 (Customer B)
    │   ├── Guest OS (Windows)
    │   ├── Applications
    │   └── Allocated Resources (2 vCPUs, 4 GB RAM)
    └── VM 3 (Customer C)
        └── ...
The hypervisor creates isolated environments for each VM. Each customer thinks they have dedicated hardware, but they’re actually sharing physical resources securely.
Key insight: Cloud providers achieve economies of scale by packing many customers onto the same physical hardware while maintaining strong isolation between them.
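That packing problem, fitting many customers' VMs onto as few hosts as possible while respecting capacity, is a form of bin packing. Here is a toy first-fit sketch with made-up VM sizes; real cloud schedulers also weigh memory, network, affinity, and failure domains, so treat this as an illustration of the idea, not any provider's actual placement algorithm:

```python
# Toy first-fit placement: pack VM requests (vCPUs) onto physical hosts.
# Illustrative only -- real schedulers consider RAM, network, and
# failure domains, not just CPU.

def place_vms(vm_sizes, host_capacity):
    """Assign each VM to the first host with room; open new hosts as needed."""
    hosts = []       # remaining free capacity per host
    placement = []   # host index chosen for each VM
    for size in vm_sizes:
        for i, free in enumerate(hosts):
            if free >= size:
                hosts[i] -= size
                placement.append(i)
                break
        else:
            hosts.append(host_capacity - size)  # open a new host
            placement.append(len(hosts) - 1)
    return placement, len(hosts)

# Eight VM requests packed onto 16-vCPU hosts
placement, n_hosts = place_vms([4, 2, 8, 2, 4, 8, 2, 2], host_capacity=16)
print(n_hosts)  # 2 -- two hosts suffice for 32 vCPUs of demand
```

The tighter the packing, the fewer idle vCPUs the provider pays for, which is where the economies of scale come from.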
Regions and Availability Zones
Cloud providers organize their infrastructure into regions (geographic locations) and availability zones (isolated data centers within a region).
AWS Region: us-east-1 (Virginia)
├── Availability Zone A (us-east-1a)
│   ├── Data Center 1
│   └── Data Center 2
├── Availability Zone B (us-east-1b)
│   ├── Data Center 3
│   └── Data Center 4
└── Availability Zone C (us-east-1c)
    ├── Data Center 5
    └── Data Center 6
Why this matters: You can deploy applications across multiple availability zones to achieve high availability. If one zone fails (power outage, network issue), your application continues running in the others.
# Example: Deploying across multiple availability zones
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Launch one instance in each AZ
for az in ['us-east-1a', 'us-east-1b', 'us-east-1c']:
    ec2.run_instances(
        ImageId='ami-0abcdef1234567890',
        InstanceType='t3.medium',
        MinCount=1,
        MaxCount=1,
        Placement={'AvailabilityZone': az},
        # ... other parameters
    )
In my experience building resilient systems, multi-AZ deployments are non-negotiable for production applications. A single AZ failure shouldn’t take down your service.
Storage in the Cloud
Cloud storage comes in several forms, each optimized for different use cases:
1. Block Storage (like EBS in AWS)
Acts like a hard drive attached to a VM. Low latency, suitable for databases and applications requiring consistent performance.
# Create and attach EBS volume
aws ec2 create-volume \
--availability-zone us-east-1a \
--size 100 \
--volume-type gp3 \
--iops 3000 \
--throughput 125
aws ec2 attach-volume \
--volume-id vol-0123456789abcdef \
--instance-id i-0123456789abcdef \
--device /dev/sdf
2. Object Storage (like S3 in AWS)
Stores files as objects with metadata. Highly scalable, durable, and cost-effective for large datasets.
import json
import boto3

s3 = boto3.client('s3')
data = {'example': 'payload'}  # any JSON-serializable object

# Upload object
s3.put_object(
    Bucket='my-bucket',
    Key='data/file.json',
    Body=json.dumps(data),
    ContentType='application/json'
)

# Object storage is designed for durability: 99.999999999% (11 nines).
# At that rate, if you store 10 million objects you can expect to lose
# one object every 10,000 years on average.
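That 11-nines figure can be sanity-checked with one line of arithmetic: the annual loss probability per object is 10^-11, so expected losses scale linearly with the number of objects stored.

```python
# Expected annual object loss at 11-nines durability
durability = 0.99999999999          # 11 nines
annual_loss_prob = 1 - durability   # ~1e-11 per object per year
objects = 10_000_000

expected_losses_per_year = objects * annual_loss_prob
years_per_lost_object = 1 / expected_losses_per_year
print(round(years_per_lost_object))  # 10000 -- one object per ~10,000 years
```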
3. File Storage (like EFS in AWS)
Network file system accessible from multiple instances simultaneously. Good for shared data access.
Performance characteristics (from production experience):
| Storage Type | Latency | Throughput | Use Case |
|---|---|---|---|
| Block (EBS) | ~1ms | Up to 2,000 MB/s | Databases, boot volumes |
| Object (S3) | ~100ms | Scalable | Media files, backups, data lakes |
| File (EFS) | ~1-5ms | Up to 10+ GB/s | Shared application data |
Cloud Networking Fundamentals
Understanding cloud networking is critical for building secure, performant applications.
Virtual Private Cloud (VPC)
A VPC is your isolated network in the cloud. You define the IP address range, subnets, route tables, and network gateways.
# Example VPC architecture
VPC: 10.0.0.0/16
├── Public Subnet 1 (10.0.1.0/24) - AZ A
│ └── Web servers (internet-facing)
├── Public Subnet 2 (10.0.2.0/24) - AZ B
│ └── Web servers (internet-facing)
├── Private Subnet 1 (10.0.10.0/24) - AZ A
│ └── Application servers
├── Private Subnet 2 (10.0.11.0/24) - AZ B
│ └── Application servers
├── Private Subnet 3 (10.0.20.0/24) - AZ A
│ └── Database servers
└── Private Subnet 4 (10.0.21.0/24) - AZ B
└── Database servers
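Subnet CIDRs like those above can be derived programmatically rather than picked by hand. A sketch using Python's ipaddress module to carve /24s out of the VPC's /16; the boto3 call is shown commented out since it needs AWS credentials, and the tier names are illustrative:

```python
# Carve per-tier /24 subnets out of a 10.0.0.0/16 VPC
import ipaddress

vpc = ipaddress.ip_network('10.0.0.0/16')
subnets = list(vpc.subnets(new_prefix=24))  # all 256 possible /24s, in order

# Match the layout above: public, app, and db tiers across two AZs
layout = {
    'public-a': subnets[1],   # 10.0.1.0/24
    'public-b': subnets[2],   # 10.0.2.0/24
    'app-a':    subnets[10],  # 10.0.10.0/24
    'app-b':    subnets[11],  # 10.0.11.0/24
    'db-a':     subnets[20],  # 10.0.20.0/24
    'db-b':     subnets[21],  # 10.0.21.0/24
}
for name, cidr in layout.items():
    print(name, cidr)
    # With credentials configured, each subnet could then be created with:
    # ec2.create_subnet(VpcId=vpc_id, CidrBlock=str(cidr),
    #                   AvailabilityZone=...)
```

Deriving the plan in code keeps the address space non-overlapping by construction, which matters once VPCs get peered together.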
Security groups act as virtual firewalls:
# Create security group for web servers
aws ec2 create-security-group \
--group-name web-servers \
--description "Security group for web tier" \
--vpc-id vpc-0123456789abcdef
# Allow HTTP and HTTPS from anywhere
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0
Load Balancing
Cloud load balancers distribute traffic across multiple instances for high availability and scalability.
Internet
|
[Load Balancer]
/ | \
/ | \
[VM 1] [VM 2] [VM 3]
The load balancer performs health checks and automatically removes unhealthy instances from rotation:
# Example: AWS Application Load Balancer with health checks
import boto3

elbv2 = boto3.client('elbv2')
response = elbv2.create_target_group(
    Name='web-servers',
    Protocol='HTTP',
    Port=80,
    VpcId='vpc-0123456789abcdef',
    HealthCheckProtocol='HTTP',
    HealthCheckPath='/health',
    HealthCheckIntervalSeconds=30,
    HealthCheckTimeoutSeconds=5,
    HealthyThresholdCount=2,
    UnhealthyThresholdCount=3
)
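The threshold parameters mean a target's state only flips after consecutive check results, so a single flaky response doesn't cause churn. A small simulation of that debouncing behavior (a sketch of the idea, not the actual ELB implementation):

```python
# Simulate load-balancer health-check debouncing: a target changes state
# only after N consecutive passes (or failures). Sketch, not ELB internals.

def track_health(results, healthy_threshold=2, unhealthy_threshold=3):
    """Replay a sequence of check results (True=pass) and return state history."""
    state, streak = 'healthy', 0
    history = []
    for passed in results:
        if passed:
            streak = streak + 1 if state == 'unhealthy' else 0
            if state == 'unhealthy' and streak >= healthy_threshold:
                state, streak = 'healthy', 0
        else:
            streak = streak + 1 if state == 'healthy' else 0
            if state == 'healthy' and streak >= unhealthy_threshold:
                state, streak = 'unhealthy', 0
        history.append(state)
    return history

# Two isolated failures don't evict the target; three in a row do,
# and two consecutive passes bring it back into rotation.
checks = [True, False, True, False, False, False, True, True]
print(track_health(checks))
```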
Auto Scaling: The Cloud’s Superpower
Auto scaling automatically adjusts resource capacity based on demand. This is where cloud computing really shines over traditional infrastructure.
# Example: Auto Scaling Group configuration
import boto3

autoscaling = boto3.client('autoscaling')

# Create Auto Scaling Group
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName='web-servers-asg',
    LaunchTemplate={
        'LaunchTemplateId': 'lt-0123456789abcdef',
        'Version': '$Latest'
    },
    MinSize=2,           # Minimum instances
    MaxSize=10,          # Maximum instances
    DesiredCapacity=3,   # Target instance count
    VPCZoneIdentifier='subnet-1,subnet-2,subnet-3',
    HealthCheckType='ELB',
    HealthCheckGracePeriod=300,
    TargetGroupARNs=[
        'arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-servers/abcdef0123456789'
    ]
)

# Create scaling policy based on CPU utilization
autoscaling.put_scaling_policy(
    AutoScalingGroupName='web-servers-asg',
    PolicyName='scale-on-cpu',
    PolicyType='TargetTrackingScaling',
    TargetTrackingConfiguration={
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'ASGAverageCPUUtilization'
        },
        'TargetValue': 70.0  # Keep average CPU near 70%
    }
)
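At its core, target tracking adjusts capacity so that metric x capacity stays roughly constant at the target. A simplified sketch of that arithmetic (the real behavior is driven by CloudWatch alarms with cooldowns and warm-up periods):

```python
# Simplified target-tracking math: scale capacity proportionally to load.
# The actual service adds cooldowns, warm-up, and alarm hysteresis.
import math

def desired_capacity(current_capacity, current_metric, target_metric):
    """Capacity needed to bring the average metric back to the target."""
    return math.ceil(current_capacity * current_metric / target_metric)

# 3 instances averaging 90% CPU, targeting 70% -> scale out
print(desired_capacity(3, 90.0, 70.0))   # 4

# 10 instances averaging 20% CPU, targeting 70% -> scale in
print(desired_capacity(10, 20.0, 70.0))  # 3
```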
Real-world example from a project I worked on:
- Normal load: 5 instances handling 1,000 requests/second
- Traffic spike: Auto-scaled to 20 instances handling 8,000 requests/second
- Cost: Only paid for extra capacity during the spike
- Time to scale: ~3 minutes from detection to new instances serving traffic
This elasticity is impossible with traditional infrastructure where you’d need to pre-provision for peak load 24/7.
Cloud-Native Architecture Patterns
Building for the cloud requires different architectural approaches than traditional on-premises systems.
Microservices Architecture
Break applications into small, independently deployable services:
Traditional Monolith:
[Web + App + Database in one package]
Cloud-Native Microservices:
[Web UI] → [API Gateway] → [Auth Service]
→ [User Service]
→ [Order Service]
→ [Payment Service]
→ [Notification Service]
Each service can:
- Scale independently
- Use different technology stacks
- Deploy without affecting others
- Fail without bringing down the entire system
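The API gateway in the diagram is essentially a routing table from path prefixes to backend services. A minimal sketch of that dispatch logic; the service names and URLs here are made up for illustration:

```python
# Minimal API-gateway routing sketch: longest-prefix match on the path.
# Service names and URLs are illustrative, not a real deployment.

ROUTES = {
    '/auth':     'http://auth-service:8000',
    '/users':    'http://user-service:8001',
    '/orders':   'http://order-service:8002',
    '/payments': 'http://payment-service:8003',
}

def route(path):
    """Return the backend URL for a request path, or None if unmatched."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    return ROUTES[max(matches, key=len)] if matches else None

print(route('/orders/42'))  # http://order-service:8002
print(route('/metrics'))    # None
```

Because each prefix maps to an independently deployed service, replacing the payment backend is a one-line routing change rather than a monolith redeploy.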
Serverless Computing
Take abstraction to the extreme: write functions, not servers. The cloud provider manages everything else.
# AWS Lambda function example
import json

def lambda_handler(event, context):
    """
    Triggered by an S3 file upload.
    Processes the image and generates a thumbnail.
    """
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    # process_image and save_thumbnail are application-defined helpers
    thumbnail = process_image(bucket, key)
    save_thumbnail(thumbnail)

    return {
        'statusCode': 200,
        'body': json.dumps('Thumbnail generated successfully')
    }

# You only pay for execution time (billed per millisecond)
# No servers to manage, automatic scaling to zero when idle
When I use serverless:
- Event-driven processing (file uploads, queue messages)
- Infrequent workloads (scheduled tasks, webhooks)
- Rapid prototyping
- Unpredictable traffic patterns
When I don’t:
- Long-running processes (max 15 minutes on Lambda)
- Consistent high-volume traffic (instances are often cheaper)
- Complex networking requirements
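The "consistent high-volume traffic is often cheaper on instances" point is just arithmetic. A back-of-envelope comparison; the prices below are illustrative assumptions modeled on published rates, so check current provider pricing before deciding:

```python
# Back-of-envelope: Lambda vs. an always-on instance for steady traffic.
# All prices are illustrative assumptions -- verify against current pricing.

def lambda_monthly_cost(req_per_sec, ms_per_req, mem_gb,
                        per_million_req=0.20, per_gb_second=0.0000166667):
    """Request fee plus compute (GB-seconds) fee over a 30-day month."""
    requests = req_per_sec * 86_400 * 30
    gb_seconds = requests * (ms_per_req / 1000) * mem_gb
    return requests / 1e6 * per_million_req + gb_seconds * per_gb_second

def instance_monthly_cost(hourly_rate=0.0416):
    """One always-on VM (rate assumed for a t3.medium-class instance)."""
    return hourly_rate * 24 * 30

# Steady 100 req/s, 50 ms each, 512 MB functions
print(round(lambda_monthly_cost(100, 50, 0.5), 2))  # ~160/month
print(round(instance_monthly_cost(), 2))            # ~30/month
```

At this steady load the always-on instance wins by roughly 5x (assuming one instance can actually handle the traffic); at a few requests per minute the comparison flips, because Lambda scales to zero between invocations.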
Infrastructure as Code
Define infrastructure using code instead of manual configuration:
# Terraform example: Define AWS infrastructure as code
provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "production-vpc"
  }
}

# Assumes aws_subnet.public is defined elsewhere in the configuration
resource "aws_instance" "web" {
  count         = 3
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.medium"
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name = "web-server-${count.index + 1}"
  }
}

resource "aws_lb" "main" {
  name               = "web-load-balancer"
  load_balancer_type = "application"
  subnets            = aws_subnet.public[*].id
}

# Deploy with: terraform apply
# Version control your infrastructure!
This approach brings software engineering practices (version control, code review, testing) to infrastructure management.
Cloud Cost Optimization
Cloud costs can spiral out of control without proper management. Here are strategies from real production environments:
1. Right-Sizing Instances
Match instance types to actual usage:
# Monitor actual resource utilization
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-0123456789abcdef \
--start-time 2025-12-01T00:00:00Z \
--end-time 2025-12-17T00:00:00Z \
--period 3600 \
--statistics Average
# If average CPU is consistently under 30%, downsize the instance
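Acting on those CloudWatch numbers is easy to automate. A sketch that averages the returned datapoints and flags under-utilized instances; the 30% cutoff is the rule of thumb from above, and the sample response below is made up to stand in for a real get-metric-statistics result:

```python
# Flag instances for downsizing from CloudWatch CPU statistics.
# sample_response mimics the shape of a get_metric_statistics result.

def should_downsize(datapoints, threshold=30.0):
    """True if average CPU across all datapoints is below the threshold."""
    values = [dp['Average'] for dp in datapoints]
    return sum(values) / len(values) < threshold

sample_response = {
    'Datapoints': [
        {'Average': 22.5}, {'Average': 18.1},
        {'Average': 25.0}, {'Average': 30.2},
    ]
}
print(should_downsize(sample_response['Datapoints']))  # True -- avg ~24%
```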
2. Reserved Instances and Savings Plans
For predictable workloads, commit to 1 or 3 years for significant discounts:
- On-demand: $0.10/hour
- 1-year reserved: $0.065/hour (35% savings)
- 3-year reserved: $0.045/hour (55% savings)
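Using the example hourly rates above, the commitment math over a full year of continuous use:

```python
# Annual cost at the example hourly rates above (24/7 usage)
hours_per_year = 24 * 365  # 8,760 hours

on_demand    = 0.10  * hours_per_year
reserved_1yr = 0.065 * hours_per_year
reserved_3yr = 0.045 * hours_per_year

print(round(on_demand))                             # 876 per year
print(round((1 - reserved_1yr / on_demand) * 100))  # 35 (% savings)
print(round((1 - reserved_3yr / on_demand) * 100))  # 55 (% savings)
```

The catch: you pay the committed rate whether or not the instance runs, so reservations only pay off for workloads you're confident will persist.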
3. Spot Instances
Use spare capacity at up to 90% discount for non-critical, interruptible workloads:
# Request Spot Instances for batch processing
import boto3

ec2 = boto3.client('ec2')
ec2.request_spot_instances(
    SpotPrice='0.02',  # Maximum price you'll pay per instance-hour
    InstanceCount=10,
    Type='one-time',
    LaunchSpecification={
        'ImageId': 'ami-0abcdef1234567890',
        'InstanceType': 't3.medium',
        # ... other specs
    }
)
4. Lifecycle Policies for Storage
Automatically move infrequent data to cheaper storage:
{
  "Rules": [{
    "Id": "archive-old-logs",
    "Status": "Enabled",
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER" }
    ]
  }]
}
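The payoff of that policy is easy to estimate. With assumed per-GB-month prices (illustrative figures, not quoted rates), the cost of keeping 1 TB of logs for a year with and without tiering:

```python
# Estimate yearly cost of 1 TB of logs with vs. without the lifecycle policy.
# Per-GB-month prices are illustrative assumptions, not quoted rates.
PRICES = {'STANDARD': 0.023, 'STANDARD_IA': 0.0125, 'GLACIER': 0.004}
GB = 1024  # 1 TB

def yearly_cost_tiered():
    # Policy above: Standard for month 1, IA for months 2-3, Glacier after
    months = ([PRICES['STANDARD']]
              + [PRICES['STANDARD_IA']] * 2
              + [PRICES['GLACIER']] * 9)
    return sum(m * GB for m in months)

flat = PRICES['STANDARD'] * GB * 12  # everything stays in Standard
print(round(flat, 2), round(yearly_cost_tiered(), 2))
```

At these assumed rates, tiering cuts storage cost by roughly 70% for data that's rarely read; Glacier retrieval fees would claw back some of that if the logs are accessed often.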
Cost breakdown from a recent project:
- Original monthly cost: $12,000
- After optimization: $7,200 (40% reduction)
- Changes: Right-sizing, reserved instances, S3 lifecycle policies, unused resource cleanup
Conclusion
Cloud computing represents a fundamental shift in how we think about infrastructure. Instead of capital expenditure on hardware, it’s operational expenditure on services. Instead of over-provisioning for peak load, we scale dynamically. Instead of managing physical servers, we focus on application logic.
The key to effective cloud usage is understanding the abstractions: IaaS gives you control, PaaS gives you simplicity, and serverless gives you scale-to-zero economics. Choose the right level of abstraction for your use case.
Start with PaaS or managed services when possible—they handle undifferentiated heavy lifting so you can focus on what makes your application unique. Drop down to IaaS when you need more control. Use serverless for event-driven and sporadic workloads.
And remember: the cloud isn’t inherently cheaper than on-premises infrastructure. The value comes from agility, scalability, and not having to manage physical hardware. With proper architecture and cost management, the cloud enables capabilities that would be impossible or prohibitively expensive with traditional infrastructure.
For deeper technical details, consult the AWS Well-Architected Framework, Azure Architecture Center, and Google Cloud Architecture Framework. The NIST Cloud Computing Standards provide vendor-neutral definitions and guidance.
Thank you for reading! If you have any feedback or comments, please contact the author at [email protected].