Why Vectors?
We all know AWS S3, right? Probably the most well-known AWS service: an amazing, cheap, durable object store that has been widely known and used for years. Many questions arise when you first hear about S3 Vectors: what are vectors? Are they for mathematicians only? How are they related to AI? Why the S3 prefix? I will answer all of that in a minute, keep reading!
What is a vector store in general, and why do we need it in AI? Let's begin with how ChatGPT and similar LLMs work under the hood. When you write a query, for example "Tell me a joke?", the message is split up and consumed as tokens. A token is a sub-word or a whole word, like ["Tell", "me", "a", "joke"], and models have embedding layers that convert tokens into vectors. The vector is context-dependent: "tell" in a joke context vs. in a command context produces different vectors.
A vector is a mathematical object that has both magnitude (size) and direction, often represented as an arrow.
For example "tell" = [0, 2, 1, 10, …. , 3]. Since they have directions and sizes, they might be very close to each other or vice versa opposite to each other. Apparently, values are not random, they hold a semantic meaning of a word. For example "Espresso" and "Latte" would have very similar vectors because they are literally similar types of coffee(maybe not that similar in taste, i am not good at coffee, but you have got an idea 😅) .
Vector DBs are designed for similarity search using specialized ANN (approximate nearest neighbor) algorithms. Relational stores can do it too, but not efficiently at scale. Your application queries the vector store → retrieves the nearest chunks → provides them to the LLM as context. For example, if I ask a chatbot "Are there any discounts for students on your website?" and the vector DB holds that or similar information, the LLM will retrieve it, maybe expand on it, and return the answer to me. That is just one use case, but you can already see the real power of vector stores.
So a vector store is an augmented knowledge base for your LLM: you can pour any information into it without an expensive training process. This pattern is called RAG (Retrieval-Augmented Generation) for that reason. By the way, it also works with pictures or music, not only text. Anything that can be embedded into a vector will fit in there perfectly.
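To make that flow concrete, here is the retrieve-then-generate loop as a minimal Python sketch. embed, vector_store.query, and llm.generate are placeholders for whatever embedding model, vector store, and LLM you actually use, not real APIs:

def answer_with_rag(question, vector_store, embed, llm, top_k=3):
    # 1. Turn the user's question into a vector
    query_vector = embed(question)
    # 2. Retrieve the most similar chunks from the vector store
    chunks = vector_store.query(query_vector, top_k=top_k)
    # 3. Hand the retrieved chunks to the LLM as extra context
    context = "\n".join(chunk["text"] for chunk in chunks)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
    return llm.generate(prompt)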
S3 Vector Store
The structure is similar to traditional S3: you create a bucket with a unique name (unique across the region, not globally), inside of which you create an index (also a unique name within the bucket). Each index can be configured according to your needs (a creation sketch follows the list):
- Dimension - a numeric value between 1 and 4096 that determines how many numbers each vector produced by your embedding model will contain (for example, if Dimension is 5, vectors will look like [1, 2, 1, 4, 5]).
- Distance Metric - choose either Cosine (which measures angular similarity) or Euclidean (which measures straight-line distance) to define how similarity between vectors is calculated during queries.
- Encryption - you can inherit the bucket-level encryption settings or override them for the vector index. If you override them, you can choose server-side encryption with AWS Key Management Service keys (SSE-KMS) or server-side encryption with Amazon S3 managed keys (SSE-S3).
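Here is roughly how creating the bucket and index looks with boto3. This is a minimal sketch using the same s3vectors client as the script later in this post; double-check the parameter names against the official API reference:

import boto3

s3vectors = boto3.client("s3vectors", region_name="us-west-2")

# Create the vector bucket (name must be unique within the region)
s3vectors.create_vector_bucket(vectorBucketName="my-sales-bucket")

# Create an index inside it; dimension must match your embedding model
# (Titan Text Embeddings V2 outputs 1024 numbers by default)
s3vectors.create_index(
    vectorBucketName="my-sales-bucket",
    indexName="my-sales-index",
    dataType="float32",
    dimension=1024,
    distanceMetric="cosine",  # or "euclidean"
)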
Another important thing to consider is metadata. For each vector we can attach a set of key-value pairs, like region, category, target audience, whatever we need. This helps make search even more relevant: results can match not only on vector similarity but also on metadata. You set these values yourself when writing the vectors; in the example later in this post I set them manually. Keep in mind they are just metadata attached to each vector, not part of the vector itself!
Let me show you some examples right now: go to S3 -> Vector buckets in the AWS console.
Here is the code I used to create vectors from data, put them into the bucket, and run a query. You can customize the region, bucket name, index name, and the query itself to experiment. Note that the vector bucket and index must already exist (see the creation sketch above). I ran this in AWS CloudShell.
import boto3
import json

region = "us-west-2"
bucket = "my-sales-bucket"
index = "my-sales-index"

bedrock = boto3.client("bedrock-runtime", region_name=region)
s3vectors = boto3.client("s3vectors", region_name=region)

# --- 1. Populate the vector index with sample sales data ---
items = [
    {
        "key": "laptop-general-sale",
        "text": "10% off all laptops this weekend in our electronics section.",
        "metadata": {"category": "laptop", "audience": "all", "discount": "10%"}
    },
    {
        "key": "laptop-student-sale",
        "text": "15% discount on lightweight laptops for students and university use.",
        "metadata": {"category": "laptop", "audience": "students", "discount": "15%"}
    },
    {
        "key": "phone-sale",
        "text": "20% off latest smartphones with long-lasting battery.",
        "metadata": {"category": "phone", "audience": "all", "discount": "20%"}
    }
]

vectors = []
for item in items:
    # Embed each text snippet with Titan Text Embeddings V2
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": item["text"]})
    )
    embedding = json.loads(resp["body"].read())["embedding"]
    vectors.append({
        "key": item["key"],
        "data": {"float32": embedding},
        "metadata": {
            "source_text": item["text"],
            **item["metadata"]
        }
    })

s3vectors.put_vectors(
    vectorBucketName=bucket,
    indexName=index,
    vectors=vectors
)

# --- 2. Query the index for "laptops for students" ---
query_text = "Do you have any sales on laptops for students?"
resp = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": query_text})
)
query_embedding = json.loads(resp["body"].read())["embedding"]

# Plain semantic search
response = s3vectors.query_vectors(
    vectorBucketName=bucket,
    indexName=index,
    queryVector={"float32": query_embedding},
    topK=3,
    returnDistance=True,
    returnMetadata=True
)
print("Top matches:")
print(json.dumps(response["vectors"], indent=2))

# Optional: restrict to laptop category only
response_filtered = s3vectors.query_vectors(
    vectorBucketName=bucket,
    indexName=index,
    queryVector={"float32": query_embedding},
    topK=3,
    filter={"category": "laptop"},
    returnDistance=True,
    returnMetadata=True
)
print("Laptop-only matches:")
print(json.dumps(response_filtered["vectors"], indent=2))
The query was: "Do you have any sales on laptops for students?" The results are the following
Pay attention to distance. Distance is the most important parameter here for understanding how similarity search works. Its range is 0–1: 0 means a perfect match and 1 means no match. Based on this parameter you can use a result or discard it; for example, you can keep only results where distance is < 0.3 if you want to avoid irrelevant information (see the filtering sketch below). Now I change the query to "Do you have any sales on laptops for university students?"
As you can see, changing only one word, "university", shifted the distances of the two matches: from 0.44 -> 0.44 and from 0.55 -> 0.57. A pretty interesting observation, isn't it? I hope it helps you understand vectors better.
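To act on those distance values in code, you can filter the query response before handing anything to an LLM. A minimal sketch reusing the response object from the script above, with an illustrative 0.3 cutoff:

# Keep only sufficiently close matches; 0.3 is an illustrative threshold,
# tune it for your own data and embedding model
relevant = [v for v in response["vectors"] if v["distance"] < 0.3]

if relevant:
    context = "\n".join(v["metadata"]["source_text"] for v in relevant)
    print("Context for the LLM:\n" + context)
else:
    print("No relevant results, better to answer without extra context")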
A Little Bit About Pricing
!!! Prices are approximate; for more detailed information read the official docs.
- Storage - $0.06 per GB/month
- PUT - $0.2 per GB
- Query - $0.0040 per TB
In my opinion it is not a lot if used wisely, and it can bring a lot of impact. A quick back-of-the-envelope calculation follows.
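For example, here is the arithmetic for a hypothetical 10 GB of vectors, using the approximate prices above (verify against the official pricing page before planning anything):

# Rough cost for 10 GB of vectors, using the approximate prices above
data_gb = 10
storage_per_month = data_gb * 0.06  # $0.60 per month to store
one_time_upload = data_gb * 0.2     # $2.00 to PUT the data once

print(f"~${storage_per_month:.2f}/month storage, ~${one_time_upload:.2f} upload")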
Conclusion
I am happy if you made it to this point! I hope you found something useful here, and of course, go try it yourself and think about how it could be useful for you. In general, vector stores are an interesting topic; I wanted you to understand the "magic" behind them as well as get some real practice. If you have any questions, feel free to ask below!