CT - 1

The document introduces datasets, variables, iterators, and filtering concepts, explaining how to count and sum data systematically. It discusses flowcharts for visualizing algorithms and the importance of data types for ensuring data integrity. Additionally, it covers complex data types like records and lists, illustrating how to bundle data elements together for structured representation.
Introduction to Datasets
Report cards

Data is inside these report cards. Each card has a unique number.

Shopping bills.

A simplified form of these bills. Each bill also has a unique serial number.

Collection of words. Each word on a separate card.

Concept of Variables, Iterators, and Filtering
Mark each card, so that you don't repeat the counting when you're interrupted or
forget the number.

Keeping track of the number by adding 1 to the previous number.

Count = 0 (initialized with zero)

Counted till 29 (since it started at 0, we have 30 cards in total)

This is the definition of an Iterator.

You go through the sequence of objects that you have and for each object you
keep a count.

You do something with each object; here we are just keeping count.

The iterator itself has an initialization. In this case count = 0 is the initialization.

We're moving cards from one pile to another pile, and increasing the count by 1
for each card which is counted.

When do we stop this process?

When we run out of counting objects.

You can pick the cards in random order.

Count is a Variable. Count is a quantity.

This variable keeps changing.

In the middle of the process, the value of count is the number of cards that
have been moved from one pile to the other so far.
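
The counting procedure above can be sketched in Python. This is only an illustration: the list `cards` is a hypothetical stand-in for the pile of report cards, and only the number of entries in it matters.

```python
# Hypothetical pile of 30 cards; the contents are placeholders.
cards = ["card"] * 30

count = 0              # initialization of the iterator
for card in cards:     # the loop stops when we run out of cards
    count = count + 1  # one more card moved to the "counted" pile

print(count)
```

At any point inside the loop, `count` holds the number of cards already moved to the second pile.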

Let's find out average marks in Mathematics.

Systematically we want to sum all the marks.

We can use the same iterator idea.

Initialize sum = 0

sum = 2171

count = 30

average = sum / count = 2171 / 30 ≈ 72.37 marks

Instead of just incrementing a variable by 1, we're accumulating the maths marks in the sum variable.

We can keep track of count and sum simultaneously.

In a single round, we can keep track of 2 variables.

We can keep track of any number of variables as long as we systematically
update them.
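
As a minimal sketch of tracking two variables in one pass, the loop below updates both a count and a sum for every card. The four marks in `maths_marks` are made up for illustration; they are not the 30 marks from the report cards.

```python
# Hypothetical maths marks, one per card.
maths_marks = [68, 91, 74, 55]

count = 0
total = 0
for mark in maths_marks:
    count = count + 1     # one more card seen
    total = total + mark  # accumulate this card's maths marks

average = total / count
print(count, total, average)
```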

Paragraph of words

Collect all verbs

Count verbs

This is what we call filtering.

Filtered and then counted. Can be done simultaneously.

It's called Filtered iteration.

We can do 2 things. Verb count and pronoun count.
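
A filtered iteration can be sketched as a loop with a condition inside it. The `words` list below is a hypothetical set of word cards, each reduced to a (word, category) pair.

```python
# Hypothetical word cards: (word, part of speech).
words = [
    ("she", "Pronoun"), ("runs", "Verb"), ("and", "Conjunction"),
    ("he", "Pronoun"), ("walks", "Verb"), ("it", "Pronoun"),
]

verb_count = 0
pronoun_count = 0
for word, category in words:
    if category == "Verb":      # filter: only verbs update verb_count
        verb_count += 1
    if category == "Pronoun":   # second filter, checked in the same pass
        pronoun_count += 1

print(verb_count, pronoun_count)
```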

It is the number of cards that have been moved from one pile to the other so far

Iterations using Combination of filtering conditions

Classroom dataset.

How many girls from Chennai?

Separate all the girls - first step of filtering

Now, look at who is from Chennai - second step of filtering

Now, count the girls that remain - third step (counting).

To do it in one step, use two conditions joined by AND: the Chennai condition
and the female condition.

Chennai girls = 0.

In one iteration, we satisfied both conditions.
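
A single-pass version with both conditions joined by AND might look like the sketch below. The `students` records and their field names are invented for illustration, so the resulting count differs from the classroom dataset's.

```python
# Hypothetical classroom records.
students = [
    {"name": "A", "gender": "F", "city": "Chennai"},
    {"name": "B", "gender": "M", "city": "Chennai"},
    {"name": "C", "gender": "F", "city": "Madurai"},
]

chennai_girls = 0
for s in students:
    # Both conditions must hold for the card to be counted.
    if s["gender"] == "F" and s["city"] == "Chennai":
        chennai_girls += 1

print(chennai_girls)
```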

Checking birthdate - born first half of year

More than or equal to 1st Jan and less than or equal to 30th of June.

Exactly half of the students.
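
The birthdate check is again an AND of two conditions, a lower bound and an upper bound. A small sketch using Python's standard `datetime.date`, with made-up birthdays:

```python
from datetime import date

# Hypothetical birthdates.
birthdays = [
    date(2005, 1, 1), date(2005, 6, 30),
    date(2005, 7, 1), date(2005, 12, 25),
]

first_half = 0
for d in birthdays:
    # On or after 1st Jan AND on or before 30th June of the same year.
    if date(d.year, 1, 1) <= d <= date(d.year, 6, 30):
        first_half += 1

print(first_half, len(birthdays))
```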

An algorithm gives us a way to compute a result that backs up a subjective opinion.

Introduction to Flow Charts


It is important we write down the step-wise procedures in a formal form.

Two ways to do it:

Flow Charts

PseudoCode

Flow charts - diagrammatic representation of the algorithm we are going to use.

Process or activity - A set of operations that can change the value of some
data we have (variables).

Flowline or arrow - Connects the other boxes. Shows the order of execution of
the program steps.

Decision - Makes a decision about which direction the program should take.
Determines which path the program will follow.

Terminal - Indicates the 'Start' or 'End' of the program.

Flowchart for counting cards

Iterator should repeat the following steps.

Flowchart for Sum of Maths Marks

Modify the flowchart. How to add up all the maths marks.

The two flowcharts are very similar.

Is there a general flow chart for iteration?

Generic flowchart for iteration:
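
The generic pattern shared by the counting and summing flowcharts can be sketched as a small function: initialize, then update once per item until no items are left. The name `iterate` and its parameters are hypothetical, chosen only for this illustration.

```python
def iterate(items, initial, update):
    """Generic iteration: start from `initial`, apply `update` once per item."""
    state = initial
    for item in items:          # ends when there are no more items
        state = update(state, item)
    return state

marks = [10, 20, 30]
count = iterate(marks, 0, lambda c, item: c + 1)     # counting
total = iterate(marks, 0, lambda s, item: s + item)  # summing
print(count, total)
```

Counting and summing differ only in the update step, which is why the two flowcharts look so alike.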

Flowchart for sum with Filtering
How do we convert a procedure in which there is filtering going on into a flow
chart?

Doing a flowchart for sum in which filtering is done.

How do we modify this to compute the sum of the boys' maths marks?
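
One possible answer, sketched in code rather than as a flowchart: add a decision step before the update. The `cards` records and their field names are hypothetical.

```python
# Hypothetical report cards with only the fields needed here.
cards = [
    {"gender": "M", "maths": 70},
    {"gender": "F", "maths": 85},
    {"gender": "M", "maths": 60},
]

boys_sum = 0
for card in cards:
    if card["gender"] == "M":      # decision box: is this a boy's card?
        boys_sum += card["maths"]  # process box: add the maths marks

print(boys_sum)
```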

Tutorial

Sanity of Data
There is a valid range to which a serial number should belong.

Check the validity of the data, i.e. whether it matches the given context or not.
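
A sanity check of this kind can be written as a small predicate. The sketch below assumes, for illustration only, that serial numbers run from 0 up to one less than the number of cards; the function name is hypothetical.

```python
def is_valid_seq_no(seq_no, n_cards):
    """True if seq_no is an integer in the assumed valid range 0 .. n_cards - 1."""
    return isinstance(seq_no, int) and 0 <= seq_no < n_cards

print(is_valid_seq_no(29, 30))  # inside the range
print(is_valid_seq_no(30, 30))  # outside the range
```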

Introduction to Datatypes

This leads us to the concept of a Data Type

Data type - to ensure sanity for data.

Boolean: Has only two values: True or False

Operation: AND, OR

Result Type: Boolean

Integer: Range of values is ..., -3, -2, -1, 0, 1, 2, 3, ...

Operation: + - x divide, <, >, = (comparison)

Result Type: Integer, Boolean

There are constraints on division. For the result to be an integer, the first
number has to be divisible by the second.

Or we change the definition to take only the quotient and drop the remainder
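
In Python the two views of integer division can be sketched with the built-in `//` (quotient) and `%` (remainder) operators:

```python
a, b = 7, 2

quotient = a // b       # keep only the quotient
remainder = a % b       # the part that is dropped
divisible = (a % b == 0)  # is a exactly divisible by b?

print(quotient, remainder, divisible)
```

Note that the result of the divisibility test is a Boolean, matching the "Integer, Boolean" result types listed above.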

Character: Values - Alphanumeric: A B ... Z, a b c ... z, 0 1 ... 9, Special
characters: . , ; : * / & $ # @ ! ...

Operation: = (equality)

Result type: Boolean

Subtypes of basic datatypes

We want to see whether we can restrict these data types to a narrower range of
values, with further constraints on the operations, so that what we want for
our datasets is implemented.

Subtypes of integer:

None of the integer operations make sense for the SeqNo data type.

Another subtype is Marks

Range of values is: 0-100

Total: 0-300

Operation: + - (x and div don't make any sense), < > = (comparison)

Result type: Marks , Boolean

Count:

Range of values: 0, 1, 2, ...

Operation: + -, < > = (x and div don't make sense)

Result: Count, Boolean
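
One way to picture a subtype is as a check that rejects values outside the allowed range. The sketch below does this for Marks (0-100) and Total (0-300); the function names and the three subject names are assumptions made for the example.

```python
def make_marks(value):
    """Return value if it is a valid Marks value (an integer in 0..100)."""
    if not isinstance(value, int) or not 0 <= value <= 100:
        raise ValueError(f"not a valid Marks value: {value!r}")
    return value

def total_marks(maths, physics, chemistry):
    """Sum of three valid Marks values; the result always lies in 0..300."""
    return make_marks(maths) + make_marks(physics) + make_marks(chemistry)

print(total_marks(70, 80, 90))
```

Only addition is used on the validated values, in line with the operations listed for Marks.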

Character Data Type and its subtype

Values: A B .. Z a b .. z 0 1 .. 9

Special characters: .,: ;

Operation: =

Result type: Boolean

Gender:

Values: M or F

Operation: =

Result: Boolean

String Data type and its subtype

Values: Any sequence of characters, with no restriction on the number of
characters.

Operation: char in string is equality (=)

Result type: Boolean

Names

Values: strings with no special characters

City

Values: Strings with no special characters

Words

Values: Strings with alphanumeric and . , : ;

Category

Values: Can take only one of the following values: Noun, Verb, Preposition,
Adjective, ...
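
A type that can take only one of a fixed set of values maps naturally onto an enumeration. The sketch below uses Python's standard `enum` module and lists only the four categories named above; the remaining ones are left out because the notes do not spell them out.

```python
from enum import Enum

class Category(Enum):
    """String subtype restricted to a fixed set of values."""
    NOUN = "Noun"
    VERB = "Verb"
    PREPOSITION = "Preposition"
    ADJECTIVE = "Adjective"

c = Category("Verb")        # a valid value
print(c == Category.VERB)   # equality is the only operation we need
```

Looking up a value outside the set, such as `Category("Banana")`, raises a `ValueError`, which is the sanity check the subtype is meant to provide.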

Transformation of sub-datatypes
It's not obvious how to create a subtype.

Integer

Dates:

What is date?

4th June, 31st January

Is that a string?

Many operations on dates cannot be done if we think of a date as a string.

Adding 2 dates.

Operations on dates: print, <, >, =

Multiplication doesn't make any sense.

It is convenient to represent a date as an integer.

Range of values: 0,1,2,3,...365 (this is a leap year)

Non-leap year: 0,1,2,...364

You can do integer operations now. Add, sub, compare.

Result type: String, Boolean
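
As a sketch of the date-as-integer idea, the function below converts a day and month into a 0-based day number for a non-leap year. The names `DAYS_IN_MONTH` and `day_number` are hypothetical.

```python
# Days in each month of a non-leap year.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def day_number(day, month):
    """0-based day of a non-leap year: 1 January -> 0, 31 December -> 364."""
    return sum(DAYS_IN_MONTH[:month - 1]) + day - 1

# Comparison now works as plain integer comparison.
print(day_number(4, 6) > day_number(31, 1))
```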

Marks

What about fractional marks, e.g., 62.5?

Round it off, a way to make things simple but we lose information.

Operation: print, + - < > =

Result: String, marks, boolean

Dealing with fractional values

Amount

Range: 0,1,2,3...

Operation: print, + - < > =

Result: String, amount, boolean

Quantity

Operation: print, + - < > =

Result: String, quantity, boolean

Non Graded
1. 31 + 28 + 31 + 30 + 31 + 30 + 31 + 15 = 227, but the day numbering starts
from 0, so the answer is 226

2. print(7050), 7050/100 = 70.5
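
Item 2 suggests one way to deal with fractional values: store the value as an integer number of hundredths and convert only when printing. A minimal sketch, assuming non-negative amounts and using hypothetical function names:

```python
def to_hundredths(whole, hundredths):
    """Store a value such as 70.50 as the integer 7050."""
    return whole * 100 + hundredths

def show(amount):
    """Print form of an amount stored in hundredths (non-negative only)."""
    return f"{amount // 100}.{amount % 100:02d}"

a = to_hundredths(70, 50)
b = to_hundredths(1, 25)
print(show(a), show(a + b))  # addition stays an integer operation
```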

Introduction to complex datatypes like records and lists
Boolean, Integers, and characters.

Various subtypes

Strings and subtypes of strings.

Records and lists - Data types packaged together in a bundle.

Two ways of bundling

Record (also called struct or tuple):

One way of bundling is the record.

In some programming languages, such as C and C++, it is called a struct;
sometimes it is a class, and in mathematics it may be called a tuple.

A record is a data item that has a number of elements. These elements are
called fields.

In the marks card: serial number, name, gender, DOB, town/city - these are the
fields of the marks card.

How do you represent the data type of this marks card?

The range of valid values that the marks card can take cannot be independent
of the values that its fields can take.

So we define what the fields in this particular data item are and what values
those fields can take.

So, it is enough to write down all the fields and the data types of those
fields.

The marks card is a record data type, which contains fields.

Each of these fields has a specific data type.

So the values for each of these fields is constrained by the kind of data type
that you are defining for that particular field.

Sanity is ensured by defining the data types of the field.
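
A record with typed fields can be sketched with Python's standard `dataclasses` module. The field names below follow the marks card fields mentioned in these notes, but their exact names and types are assumptions, and the two checks only illustrate how field-level constraints keep the whole record sane.

```python
from dataclasses import dataclass

@dataclass
class MarksCard:
    seq_no: int
    name: str
    gender: str         # Gender subtype: "M" or "F"
    date_of_birth: str  # kept as a string here for simplicity
    city: str
    maths: int          # Marks subtype: 0..100

    def __post_init__(self):
        # Field-level sanity checks, run once when the record is created.
        if self.gender not in ("M", "F"):
            raise ValueError(f"invalid gender: {self.gender!r}")
        if not 0 <= self.maths <= 100:
            raise ValueError(f"invalid marks: {self.maths!r}")

card = MarksCard(0, "Asha", "F", "4th June", "Chennai", 82)
print(card.name, card.maths)
```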

WordInPara

A record with four fields.

List

Is there a data type that can represent the set of all marks cards?

For one card, I can write a record. But what about the entire list, or the
entire set of cards?

So, a sequence of data elements is called a list.

A string is a sequence of characters, so a string is a list of characters.

Marks card list is just a sequence of marks card record data types.

A paragraph word list is just a sequence of WordInPara record data type.

So, the set of all the marks cards, is a marks card list.

Set of all the words in the paragraph is a paragraph word list and similarly
the set of all shopping bills is the shopping bills list.
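
In code, a list of records is simply a list whose elements all have the same record type. The cut-down `MarksCard` below, with only two fields, is a hypothetical stand-in for the full marks card record.

```python
from dataclasses import dataclass

@dataclass
class MarksCard:
    seq_no: int
    name: str

# A marks card list: a sequence of MarksCard records.
marks_card_list = [MarksCard(0, "Asha"), MarksCard(1, "Ravi")]

# A string viewed as a list of characters.
letters = list("card")

print(len(marks_card_list), letters)
```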

What is the record data type of a shopping bill?

I don't know how to define it, because I don't know how many elements are
present in the bill.

Because there is a sequence of elements there, it looks like a list.

What is it a list of?

It has to be list of items.

So if we take one item and make a record data type out of it, then the bill
will contain a list of those records.

What does one item look like?

It has the item name, category name, quantity, and price.

Item > Itemlist

In this way, I have created a record data type for shopping bills. Note that
the record data type for a shopping bill contains a number of fields, and one
of those fields is a list of records.

So a shopping bill is a record with a field that is a list of another record
type, item, which in turn has a number of fields.
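
The nesting described above (a record containing a list of records) can be sketched as follows. The `Item` fields follow the item description in these notes; the `seq_no` field of `ShoppingBill` and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    name: str
    category: str
    quantity: int
    price: int

@dataclass
class ShoppingBill:
    seq_no: int
    # One field of the bill is itself a list of Item records.
    items: List[Item] = field(default_factory=list)

bill = ShoppingBill(0, [
    Item("Rice", "Grocery", 2, 60),
    Item("Soap", "Toiletries", 1, 35),
])
print(len(bill.items), bill.items[1].name)
```

Because the item list can have any length, the bill record stays well defined no matter how many items a bill contains.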
