CT - 1
         Introduction to Datasets
           Report cards
           Data is inside these report cards. Each card has unique number.
           Shopping bills.
           Simplified form of these bills. Also has unique serial number.
           Collection of words. Each word on a separate card.
         Concept of Variables, Iterators, and
         Filtering
           Mark each card so that you don't repeat the count when you're interrupted or
           forget the number.
           Keeping track of the number by adding 1 to the previous number.
           Count = 0 (initiated with zero)
           Counted till 29 (since it started with 0, we have 30 total cards)
           This is the definition of an Iterator.
           You go through the sequence of objects that you have and for each object you
           keep a count.
           You do something with each object; here we are just keeping count.
           The iterator itself has an initialization. In this case, count = 0 is the initialization.
           We're moving cards from one pile to another pile, and increasing the count by 1
           for each card which is counted.
           When do we stop this process?
         When we run out of counting objects.
         You can pick card randomly.
         Count is a Variable. Count is a quantity.
         This variable keeps changing.
         In the middle, the value of count is the number of cards that have been moved
         from one pile to the other so far
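The counting procedure above can be sketched in Python. This is a minimal illustration, not part of the lecture: the list `cards` is a hypothetical stand-in for the pile of 30 cards.

```python
# Hypothetical pile of 30 cards; only the number of cards matters here.
cards = [f"card {i}" for i in range(30)]

count = 0                  # initialization of the iterator
for card in cards:         # move one card at a time to the "seen" pile
    count = count + 1      # add 1 to the previous number

print(count)  # 30
```

The loop stops on its own when there are no more cards to pick, which matches the stopping rule stated above.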
         Let's find out average marks in Mathematics.
         Systematically we want to sum all the marks.
         We can use the same iterator idea.
         Initialize sum = 0
         sum = 2171
         count = 30
         average = 2171 / 30 ≈ 72.37 marks
         Instead of just incrementing the sum variable, we're accumulating maths marks.
         We can keep track of count and sum simultaneously.
         In a single round, we can keep track of 2 variables.
         We can keep track of any number of variables as long as we systematically
         update them.
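A small sketch of tracking two variables in a single pass, assuming the Mathematics marks are available as a Python list. The five marks below are made up for illustration and are not the class data.

```python
# Hypothetical Mathematics marks, one per report card.
maths_marks = [72, 65, 88, 91, 54]

count = 0   # number of cards seen so far
total = 0   # accumulated marks so far
for mark in maths_marks:
    count += 1        # increment: always add 1
    total += mark     # accumulate: add the value on the card

average = total / count
print(count, total, average)  # 5 370 74.0
```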
         Paragraph of words
         Collect all verbs
         Count verbs
         This is what we call filtering.
         Filtered and then counted. Can be done simultaneously.
         It's called Filtered iteration.
         We can do 2 things. Verb count and pronoun count.
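Filtered iteration might look like the sketch below. The word list and its category labels are invented for the example; each word card is modelled as a (word, category) pair.

```python
# Hypothetical word cards: (word, part of speech).
words = [
    ("She", "Pronoun"),
    ("runs", "Verb"),
    ("and", "Conjunction"),
    ("he", "Pronoun"),
    ("walks", "Verb"),
    ("slowly", "Adverb"),
]

verb_count = 0
pronoun_count = 0
for word, category in words:
    if category == "Verb":        # filter: only verbs update verb_count
        verb_count += 1
    if category == "Pronoun":     # a second filter in the same pass
        pronoun_count += 1

print(verb_count, pronoun_count)  # 2 2
```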
         Iterations using Combination of filtering
         conditions
           Classroom dataset.
           How many girls from Chennai?
           Separate all girls - one step of filtering
           Now, pick out the ones from Chennai - second step of filtering
           Now, count the cards that remain - third step.
           To do it in one step. Two conditions. AND of two conditions. Chennai condition
           and female condition
           Chennai girls = 0.
           In one iteration, we satisfied both conditions.
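The one-pass version with an AND of two conditions can be sketched as follows. The student records and the field names `gender` and `city` are assumptions made for the example.

```python
# Hypothetical classroom cards.
students = [
    {"name": "A", "gender": "F", "city": "Chennai"},
    {"name": "B", "gender": "M", "city": "Chennai"},
    {"name": "C", "gender": "F", "city": "Madurai"},
    {"name": "D", "gender": "F", "city": "Chennai"},
]

chennai_girls = 0
for s in students:
    # Both conditions must hold for the card to be counted.
    if s["gender"] == "F" and s["city"] == "Chennai":
        chennai_girls += 1

print(chennai_girls)  # 2
```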
           Checking birthdate - born first half of year
           More than or equal to 1st Jan and less than or equal to 30th of June.
           Exactly half students.
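The birth-date check is also an AND of two comparisons. The sketch below assumes the dates are stored as `datetime.date` values and compares (month, day) pairs; the four dates are made up.

```python
from datetime import date

# Hypothetical birth dates.
birthdates = [
    date(2001, 1, 1),
    date(2001, 6, 30),
    date(2001, 7, 1),
    date(2000, 12, 25),
]

first_half = 0
for d in birthdates:
    # On or after 1st January AND on or before 30th June.
    if (1, 1) <= (d.month, d.day) and (d.month, d.day) <= (6, 30):
        first_half += 1

print(first_half)  # 2
```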
           Give an algorithm to compute and back subjective opinions.
         Introduction to Flow Charts
           It is important we write down the step-wise procedures in a formal form.
           Two ways to do it:
               Flow Charts
               PseudoCode
           Flow charts - diagrammatic representation of the algorithm we are going to use.
         Process or activity - to write down a set of operations that can change the value
         of some data we have (variables).
         Flowline or arrow - To connect other boxes. Shows the order of execution of the
         program steps.
         Decision - Makes a decision about the direction the program should take.
         Determines which path the program will take.
         Terminal - Indicates the 'Start' or 'End' of the program
         Flowchart for counting cards
         Iterator should repeat the following steps.
         Flowchart for Sum of Maths Marks
         Modify the flowchart. How to add up all the maths marks.
         2 flowcharts very similar.
         Is there a general flow chart for iteration?
         Generic flowchart for iteration:
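One possible way to express the generic iteration in code is a function that takes the initial value and the update step as parameters. The names `iterate`, `initial`, and `update` are hypothetical and chosen only for this sketch.

```python
def iterate(items, initial, update):
    """Generic iteration: initialize, then update once per item."""
    result = initial
    for item in items:                 # repeat while items remain
        result = update(result, item)  # the step that differs per task
    return result

marks = [40, 75, 90]  # made-up marks

# Counting and summing differ only in the update step.
count = iterate(marks, 0, lambda acc, m: acc + 1)
total = iterate(marks, 0, lambda acc, m: acc + m)
print(count, total)  # 3 205
```

This mirrors the observation that the counting and summing flowcharts are nearly identical.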
         Flowchart for sum with Filtering
           How to convert a procedure in which there is filtering going on into a flow
           chart.
           Doing a flowchart for sum in which filtering is done.
           How do we modify this to do sum of boys maths marks?
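A sketch of the sum with filtering: the decision box becomes an `if` inside the loop. The card data and field names are assumed for illustration.

```python
# Hypothetical report cards with only the fields needed here.
cards = [
    {"gender": "M", "maths": 70},
    {"gender": "F", "maths": 85},
    {"gender": "M", "maths": 60},
]

boys_sum = 0
for card in cards:
    if card["gender"] == "M":       # decision: is this a boy's card?
        boys_sum += card["maths"]   # process: accumulate only if yes

print(boys_sum)  # 130
```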
         Tutorial
         Sanity of Data
           There's a valid range to which a serial number should belong.
           Check whether the data is valid, i.e. whether it matches the given context.
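A sanity check on a serial number could be sketched like this. The function name and the assumption that valid serial numbers run from 0 to one less than the number of cards are mine, not stated in the notes.

```python
def is_valid_serial(serial_no, total_cards):
    """Assumed rule: serial numbers are integers in 0 .. total_cards - 1."""
    return type(serial_no) is int and 0 <= serial_no < total_cards

print(is_valid_serial(29, 30))    # True
print(is_valid_serial(30, 30))    # False
print(is_valid_serial("7", 30))   # False
```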
         Introduction to Datatypes
           This leads us to the concept of a Data Type
          Data type - to ensure sanity for data.
          Boolean: Has only two values: True or False
              Operation: AND, OR
              Result Type: Boolean
          Integer: Range of values is ..., -3, -2, -1, 0, 1, 2, 3, ...
              Operation: +, -, x, divide, <, >, = (comparison)
              Result Type: Integer, Boolean
              There are constraints on division. For result to be integer, first number has to
              be divisible by second.
              Or we change the definition to take only the quotient and drop the remainder
          Character: Values - Alphanumeric: A B ... Z, a b c... z, 0 1 ...9, Special
          characters: ., ; : * / & $ # @ ! ...
              Operation: = (equality)
              Result type: Boolean
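The operations and result types listed above can be tried out in Python, used here only as a convenient calculator. Python's `//` and `%` operators correspond to the "take only the quotient and drop the remainder" reading of division.

```python
# Boolean operations give Boolean results.
both = True and False      # False
either = True or False     # True

# Integer arithmetic gives integers; comparisons give Booleans.
total = 7 + 2              # 9
smaller = 7 < 2            # False

# Division: exact only when the first number is divisible by the second.
exact = 6 // 3             # 2
quotient = 7 // 2          # 3 (remainder dropped)
remainder = 7 % 2          # 1

# Characters support only equality.
same_char = "A" == "a"     # False

print(both, either, total, smaller, exact, quotient, remainder, same_char)
```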
         Subtypes of basic datatypes
         We want to see whether we can restrict these data types to further narrow range
         of values and further constraints on the operations so that what we wanted for
         our datasets is implemented.
         Subtypes of integer:
         None of the integer operations make sense for the SeqNo data type.
         Another subtype is Marks
             Range of values is: 0-100
             Total: 0-300
             Operation: + - (x and div don't make any sense), < > = (comparison)
             Result type: Marks , Boolean
         Count:
             Range of values: 0, 1, 2, ...
             Operation: + -, < > = (x div doesn't make sense)
             Result: Count, Boolean
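A subtype can be approximated in code by checking the range whenever a value is created. The helper `make_marks` and its `maximum` parameter are hypothetical names for this sketch; the ranges 0-100 and 0-300 come from the notes above.

```python
def make_marks(value, maximum=100):
    """Accept only integers in 0 .. maximum (assumed inclusive)."""
    if type(value) is not int or not 0 <= value <= maximum:
        raise ValueError(f"marks must be an integer in 0..{maximum}, got {value!r}")
    return value

maths = make_marks(72)
physics = make_marks(65)
chemistry = make_marks(88)

# Adding marks is allowed; the total has its own wider range.
total = make_marks(maths + physics + chemistry, maximum=300)
print(total)  # 225
```

Calling `make_marks(101)` raises `ValueError`, which is the sanity check the subtype is meant to provide.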
         Character Data Type and its subtype
                 Values: A B .. Z a b .. z 0 1 .. 9
                 Special characters: .,: ;
                 Operation: =
                 Result type: Boolean
          Gender:
                 Values: M or F
                 Operation: =
                 Result: Boolean
          String Data type and its subtype
                 Values: Any sequence of characters, no restriction in the length of
                 characters.
                 Operation: char in string is equality (=)
                 Result type: Boolean
          Names
                 Values: strings with no special characters
          City
                 Values: Strings with no special characters
          Words
                 Values: Strings with alphanumeric and . , : ;
          Category
                 Values: Can take only one of the following values: Noun, Verb, Preposition,
                 Adjective, ...
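Subtypes that allow only a fixed set of values, such as Gender and Category, map naturally onto an enumeration. The sketch below uses Python's `enum` module; the Category enum lists only the four values named above, while the notes indicate the real list continues.

```python
from enum import Enum

class Gender(Enum):
    M = "M"
    F = "F"

class Category(Enum):
    # Only the values named in the notes; the full list is longer.
    NOUN = "Noun"
    VERB = "Verb"
    PREPOSITION = "Preposition"
    ADJECTIVE = "Adjective"

# The only operation needed is equality, and the result is a Boolean.
print(Category("Verb") == Category.VERB)  # True
print(Gender("M") == Gender.F)            # False
```

Constructing a value outside the set, for example `Gender("X")`, raises `ValueError`.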
         Transformation of sub-datatypes
          It's not obvious how to create a subtype.
          Integer
                 Dates:
                What is date?
                4th June, 31st January
                Is that a string?
                Many operations on dates cannot be done if we think date as a string.
                Adding 2 dates.
                Operations on dates: print, <, >, =
                Multiplication doesn't make any sense.
                Nice to make date as integer
                Range of values: 0,1,2,3,...365 (this is a leap year)
                Non-leap year: 0,1,2,...364
                You can do integer operations now. Add, sub, compare.
                Result type: String, Boolean
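Storing a date as an integer could be sketched as follows, assuming a non-leap year and day numbers starting from 0 for 1st January. The function names are hypothetical.

```python
# Days per month in a non-leap year.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

def to_day_number(day, month):
    """Convert (day, month) to an integer in 0..364; 1st January is 0."""
    return sum(DAYS_IN_MONTH[:month - 1]) + day - 1

def to_string(day_number):
    """Convert the integer back to a printable date (the print operation)."""
    month = 0
    while day_number >= DAYS_IN_MONTH[month]:
        day_number -= DAYS_IN_MONTH[month]
        month += 1
    return f"{day_number + 1} {MONTH_NAMES[month]}"

print(to_day_number(4, 6))                          # 154
print(to_day_number(31, 1) < to_day_number(4, 6))   # True
print(to_string(154))                               # 4 June
```

Comparison of dates becomes plain integer comparison, while printing converts the integer back to a string.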
            Marks
                What about fractional marks, ex, 62.5
                Round it off, a way to make things simple but we lose information.
                Operation: print, + - < > =
                Result: String, marks, boolean
         Dealing with fractional values
           Amount
                Range: 0,1,2,3...
                Operation: print, + - < > =
                Result: String, amount, boolean
           Quantity
                Operation: print, + - < > =
                Result: String, quantity, boolean
         Non Graded
         1. 31 + 28 + 31 + 30 + 31 + 30 + 31 + 15 = 227, but we initialized it from 0, so
           answer is 226
         2. print(7050), 7050/100 = 70.5
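Answer 2 suggests that a fractional value is stored as an integer number of hundredths and only converted when printed. A minimal sketch under that assumption follows; the function name `format_amount` is hypothetical.

```python
def format_amount(stored):
    """Assumes `stored` is a non-negative integer count of hundredths."""
    whole, hundredths = divmod(stored, 100)
    return f"{whole}.{hundredths:02d}"

print(format_amount(7050))         # 70.50
print(format_amount(7050 + 125))   # 71.75
```

Addition and comparison stay exact integer operations, and no information is lost to rounding.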
         Introduction to complex datatypes like
         records and lists
           Boolean, Integers, and characters.
           Various subtypes
         Strings and subtypes of strings.
         Records and lists - Data types packaged together in a bundle.
         Two ways of bundling
         Record (also called struct or tuple):
         One way of bundling is record.
         In some programming languages, such as C and C++, it is called a struct,
         sometimes a class, and in mathematics it may be called a tuple.
         A record is a data item which has a number of
         elements. These elements are called fields.
         In the marks card: serial number, name, gender, DOB, town/city - these are the
         fields of the marks card.
         How do you represent the data type of this marks card?
         The valid values, the range of the valid values that the marks card can take
         cannot be independent of the values that the fields can take.
         So trying to define what are the fields in this particular data item and what
         are the values that fields can take.
         So, it is enough to basically write down all the fields and the data types of
         those fields.
                Marks card is a record data type. Which contains fields.
                Each of these fields has a specific data type.
                So the values for each of these fields is constrained by the kind of data type
                that you are defining for that particular field.
                Sanity is ensured by defining the data types of the field.
                WordInPara
                A record with four fields.
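The marks card record described above can be sketched with a Python dataclass. The field names, the sample values, and the choice to store the date of birth as a day number are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class MarksCard:
    seq_no: int          # serial number of the card
    name: str
    gender: str          # "M" or "F"
    date_of_birth: int   # assumed: day number within the year
    city: str
    maths: int           # 0..100

    def __post_init__(self):
        # Sanity is ensured by checking each field against its data type.
        if self.gender not in ("M", "F"):
            raise ValueError("gender must be 'M' or 'F'")
        if not 0 <= self.maths <= 100:
            raise ValueError("marks must be in 0..100")

card = MarksCard(seq_no=0, name="Asha", gender="F",
                 date_of_birth=154, city="Chennai", maths=72)
print(card.name, card.maths)  # Asha 72
```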
         List
                Is there a data type that can represent the set of all marks cards?
                For one card, we can write a record. But what about the entire set of
                cards?
         So, sequence of data elements is called a list.
         String has a sequence of characters so string is basically a list which
         contains list of characters.
         Marks card list is just a sequence of marks card record data types.
         A paragraph word list is just a sequence of WordInPara record data type.
         So, the set of all the marks cards, is a marks card list.
         Set of all the words in the paragraph is a paragraph word list and similarly
         the set of all shopping bills is the shopping bills list.
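A short sketch of both kinds of list: a string viewed as a list of characters, and a list whose elements are records. The reduced `MarksCard` record with three fields is a simplification for this example.

```python
from typing import NamedTuple

class MarksCard(NamedTuple):
    seq_no: int
    name: str
    maths: int

# A string is a sequence of characters.
word = "card"
print(list(word))  # ['c', 'a', 'r', 'd']

# A marks card list is a sequence of marks card records.
marks_card_list = [MarksCard(0, "A", 72), MarksCard(1, "B", 65)]
print(marks_card_list[1].maths)  # 65
```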
         What is the record data type of shopping list?
         It is not obvious how to define it, because we don't know how many elements
         are present in a bill.
         Because it looks like there are sequence of elements there, it looks like a
         list.
         What is it a list of?
         It has to be list of items.
         So if we take one item and make a record data type out of it, then it will be a
         list.
         What does one item look like?
         It has basically the item name, category name, quantity, and price.
         Item > Itemlist
         In this way, we have created a record data type for shopping bills. Note that
         the record data type for a shopping bill contains a number of fields, and one
         of those fields is a list. That list is a list of Item records, and each Item
         record in turn has a number of fields.
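The nested structure (a record containing a list of records) can be sketched with two dataclasses. The item fields follow the notes (item name, category, quantity, price); the sample values, the `seq_no` field name, and the reading of `price` as a per-unit price are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    name: str
    category: str
    quantity: int
    price: int       # assumed: price per unit

@dataclass
class ShoppingBill:
    seq_no: int                                       # unique serial number
    items: List[Item] = field(default_factory=list)   # the list field

bill = ShoppingBill(seq_no=1, items=[
    Item("Rice", "Grocery", 2, 60),
    Item("Soap", "Toiletries", 1, 35),
])

# Iterating over the list field works like any other iteration.
total = 0
for item in bill.items:
    total += item.quantity * item.price

print(total)  # 155
```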