Working with Strings
Symbols, tickers and dates are all text. Slice them, clean them and reshape them with Python's string toolkit.
- Strings as text
- Indexing & slicing
- Upper, lower & strip
- split & join
- Checking & replacing
- Building ticker strings
Look around a trading screen and you'll see text everywhere: symbols like RELIANCE, exchange codes like NSE, dates like 2026-06-25, order types, news headlines. To a computer all of this is one type - the string - and learning to slice, clean and reshape strings is a genuinely useful everyday skill. Data almost never arrives in the exact shape you want; strings are how you fix that. This chapter rounds off the basics, and it's a practical one.
A string is a sequence of characters
A string is just characters in a row, and each character has a position - an index - starting at 0. Square brackets reach in by position, and a colon inside them takes a slice (a range):
# A string is a sequence of characters - reach in by position with [ ].
symbol = "RELIANCE"
print("First letter:", symbol[0]) # R - counting starts at 0
print("Last letter :", symbol[-1]) # E - negative counts from the end
print("First three :", symbol[:3]) # REL - a "slice" from start up to 3
print("Length :", len(symbol)) # 8 - how many characters
# Slicing is a quick way to pull pieces out of a date string.
date = "2026-06-25"
print("Year :", date[:4]) # 2026
print("Month:", date[5:7]) # 06
print("Day :", date[8:]) # 25
Output:
First letter: R
Last letter : E
First three : REL
Length : 8
Year : 2026
Month: 06
Day : 25
The two ideas that trip people up are zero-based counting and negative indexing. Picture the word laid out with its index numbers:
Character:  R   E   L   I   A   N   C   E
Index:      0   1   2   3   4   5   6   7
Negative:  -8  -7  -6  -5  -4  -3  -2  -1
So symbol[0] is R, symbol[-1] is the last E, and the slice symbol[:3] grabs REL - it stops just before index 3. That "up to, but not including" rule is everywhere in Python, so it's worth fixing in your mind now.
Strings are indexed from 0; negative indices count from the end (-1 is last). A slice text[start:stop] includes start but stops just before stop. Leave a side blank to go to the very beginning or end.
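The "up to, but not including" rule is easiest to trust once you watch it run. Here is a small sketch; the variable names are made up for illustration, and the sample value HDFCBANK is borrowed from later in this chapter:

```python
# Illustrative names only - any string behaves the same way.
ticker = "HDFCBANK"

first_four = ticker[:4]   # indices 0, 1, 2, 3 -> "HDFC"
the_rest = ticker[4:]     # index 4 through the end -> "BANK"
middle = ticker[2:6]      # indices 2, 3, 4, 5 -> "FCBA"
last_two = ticker[-2:]    # the final two characters -> "NK"

print(first_four, the_rest, middle, last_two)

# Because stop is excluded, [:4] and [4:] never overlap or leave a gap.
print(first_four + the_rest == ticker)   # True

# A slice [start:stop] always holds stop - start characters (here 6 - 2 = 4).
print(len(middle))                       # 4
```

A handy consequence: cutting a string at any index with [:n] and [n:] gives two pieces that glue back into the original.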
Cleaning and checking text
Real data is messy - stray spaces, mixed case, codes glued together. String methods, attached with a dot, tidy it up. Here are the ones you'll use weekly:
# Messy text in, tidy text out. Methods are attached with a dot.
raw = " reliance "
clean = raw.strip().upper() # strip removes outer spaces, upper capitalises
print("Cleaned:", repr(clean)) # repr shows the quotes, proving spaces are gone
# Checking what a string contains.
symbol = "NSE:RELIANCE"
print("Starts with NSE? ", symbol.startswith("NSE")) # True
print("Contains a colon?", ":" in symbol) # True
# Replacing part of a string (it returns a NEW string).
print("Swap exchange :", symbol.replace("NSE", "BSE")) # BSE:RELIANCE
Output:
Cleaned: 'RELIANCE'
Starts with NSE? True
Contains a colon? True
Swap exchange : BSE:RELIANCE
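strip() shaves off surrounding whitespace (spaces, tabs and newlines), upper()/lower() fix the case, startswith() and the in keyword check contents, and replace() swaps text out. Notice that we stored the cleaned result in a new variable, clean, rather than expecting raw to change. That is because of an important quirk: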
Strings are immutable - you can't change one in place. Methods like upper() and replace() don't edit the original; they hand you a brand-new string. So you must capture it: clean = raw.strip().upper(). Writing raw.strip() on its own and expecting raw to change is a classic beginner slip.
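To see the beginner slip in action, here is a minimal sketch. The sample value is invented, and the try/except at the end is only there so the script keeps running after the error; don't worry if you haven't met it yet:

```python
raw = "  tcs  "

# The slip: the method runs, but its result is thrown away.
raw.strip().upper()
print(repr(raw))      # '  tcs  ' - raw is unchanged

# The fix: capture the new string under a name.
clean = raw.strip().upper()
print(repr(clean))    # 'TCS'

# Trying to edit a character in place is an error, not a silent no-op.
try:
    clean[0] = "X"
except TypeError as error:
    print("Cannot edit in place:", error)
```

If you do want raw itself to hold the tidy text, reassign the same name: raw = raw.strip().upper().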
Splitting and joining
Two methods are a matched pair and worth memorising together. split() breaks a string into a list of pieces; join() glues a list of strings back into one. They're how you read and write the comma-separated data you'll meet constantly:
# split() breaks text into a list; join() glues a list back into text.
line = "RELIANCE,TCS,INFY,HDFCBANK"
symbols = line.split(",") # split on each comma
print("Symbols list:", symbols)
print("Count :", len(symbols))
print("Third symbol:", symbols[2]) # INFY
# join() is split's mirror image - here with " | " between items.
print("Joined :", " | ".join(symbols))
# Building a Yahoo Finance ticker by adding the .NS suffix (a taste of Chapter 34).
nse_symbol = "RELIANCE"
print("Yahoo ticker:", nse_symbol + ".NS")
Output:
Symbols list: ['RELIANCE', 'TCS', 'INFY', 'HDFCBANK']
Count : 4
Third symbol: INFY
Joined : RELIANCE | TCS | INFY | HDFCBANK
Yahoo ticker: RELIANCE.NS
We split "RELIANCE,TCS,INFY,HDFCBANK" on the comma to get four separate symbols, then joined them back with " | ". (Don't worry about lists yet - that's literally the next chapter.) And building a Yahoo Finance ticker is just sticking .NS on the end, a small taste of the symbol systems we'll untangle in Chapter 34.
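The same tools chain together. As a rough sketch, suppose a messy exchange-prefixed code arrives as text (the input value and variable names below are hypothetical); cleaning, splitting and joining turn it into the pieces you need:

```python
# Hypothetical messy input - stray spaces and lowercase letters.
raw_code = "  nse:reliance  "

tidy = raw_code.strip().upper()   # "NSE:RELIANCE"
parts = tidy.split(":")           # ["NSE", "RELIANCE"]

exchange = parts[0]               # "NSE"
symbol = parts[1]                 # "RELIANCE"

# Same .NS suffix as in the example above.
yahoo_ticker = symbol + ".NS"
print(exchange, symbol, yahoo_ticker)

# join() with the same separator undoes the split.
rebuilt = ":".join(parts)
print(rebuilt == tidy)            # True
```

Splitting and joining on the same separator round-trips the text, which makes for a quick sanity check when you reshape data.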
Every Python 3 string is Unicode. That means a string can hold text in any writing system on Earth - English, Hindi (रिलायंस), Tamil, Bengali, Chinese - and even emoji, all natively, no special handling. For Indian markets, where company names and news arrive in many languages and scripts, this is quietly a very big deal: to Python, "रिलायंस" is just as ordinary a string as "RELIANCE".
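A short sketch of what "just as ordinary" means in practice: indexing, the in keyword and + all work on Devanagari text exactly as they do on English. The variable names are illustrative only:

```python
name = "रिलायंस"

print(name[0])                    # first character, reached by index as usual
print("र" in name)                # True - the in keyword works on any script
label = name + " (RELIANCE)"      # mixing scripts in one string is fine
print(label)

# Characters and bytes are different things: stored as UTF-8,
# this text takes more bytes than it has characters.
print(len(name.encode("utf-8")) > len(name))   # True
```

One caution: len() counts Unicode code points, so a letter written with a combining vowel sign counts as more than one.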
Try it yourself
- From the string "2026-06-25", use slicing to print it in DD-MM-YYYY order: 25-06-2026. (Combine three slices with +.)
- Take " nse:tcs " and turn it into the clean, uppercase "NSE:TCS" in a single chained line of methods.
- Split the sentence "buy 50 RELIANCE" on spaces. What's at index 1, and what type is it - could you multiply it yet?
Recap
- A string is a sequence of characters, indexed from 0; negative indices count from the end.
- A slice text[start:stop] includes start and stops just before stop.
- Methods clean and inspect text - strip, upper, lower, replace, startswith, and in - but strings are immutable, so they return new strings.
- split() turns text into a list and join() turns a list back into text - the everyday tools for comma-separated data.
That wraps up Module 1 - you can install Python, run code, store and convert values, calculate, format output, and handle text. You officially know the basics. In Module 2 we level up to the structures that hold many values at once, starting with the most important one of all: the list, perfect for a whole series of prices.