Lecture #12
Time And Counters;
               Watchdog Timers
                           18-348 Embedded System Engineering
                                    Philip Koopman
                                   Monday, 22-Feb-2016
           Electrical & Computer Engineering
© Copyright 2006-2016, Philip Koopman, All Rights Reserved
             https://www.youtube.com/watch?v=-5wpm-gesOY
Where Are We Now?
   Where we’ve been:
    • Part 1 of course – lots of general topics that you’ll need
    • DON’T FORGET TO:
         – Look at feedback from TAs on your labs
   Where we’re going today:
    • Time and counters – a bit more nitty-gritty
    • Look for using previous concepts (e.g., fixed point math)
   Where we’re going next:
    •   Test #1 in class Wednesday Feb 24, 2016
    •   Interrupts, concurrency, and scheduling
    •   Analog and other I/O
    •   Test #2 on Wednesday April 20, 2016
    •   Final project is more self-directed; a bit more time to work on it
Preview
   Time of day
    • Accuracy, drift
    • How computers really measure time
   Hardware timer operation
    • Setting up a timer, including frequency calculations
    • Converting a hardware timer to time of day
    • Classic timer mistakes
   Watchdog timer operation
    • How and why to use a watchdog timer
    • How not to use a watchdog timer
How Do You Know What Time It Really Is?
   www.time.gov
   Other good sources:
    •   GPS
    •   NIST radio broadcast (WWV radio)
    •   Cell phone system
    •   Internet time servers
    • But you have to know what time zone you’re in!
         – (What about mobile systems?)
Daylight Saving Time & Time Zones
   Daylight saving time switches on particular dates
    • Which are declared annually by Congress and have been known to change
         – WW II had war-time daylight saving time to save energy
         – “Energy Crisis” in the 70’s resulted in year-round daylight saving time
         – Only the Navajo nation within Arizona does DST (not the state; not the Hopi resv.)
    • http://www.energy.ca.gov/daylightsaving.html
         – Beginning in 2007, Daylight Saving Time extended:
         – 2 a.m. on the Second Sunday in March to
           2 a.m. on the First Sunday of November.
         – This does not correspond to European dates!
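The post-2007 US rule above can be sketched in C as follows. This is a minimal illustration, not course code: the function names are hypothetical, and the rule is hard-coded only to show the calendar arithmetic. Since the changeover dates have been known to change, a fielded system would want the rule in updatable data rather than in code.

```c
/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
   (Sakamoto's method; month is 1..12.) */
static int day_of_week(int year, int month, int day)
{
    static const int offset[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (month < 3) {
        year -= 1;   /* treat January and February as part of the prior year */
    }
    return (year + year / 4 - year / 100 + year / 400
            + offset[month - 1] + day) % 7;
}

/* Day of the month of the nth Sunday (n = 1, 2, ...) in the given month. */
static int nth_sunday(int year, int month, int n)
{
    int first_dow = day_of_week(year, month, 1);
    int first_sunday = 1 + (7 - first_dow) % 7;
    return first_sunday + 7 * (n - 1);
}

/* US rule from 2007 onward: starts on the second Sunday in March ... */
int us_dst_start_day(int year)
{
    return nth_sunday(year, 3, 2);
}

/* ... and ends on the first Sunday in November. */
int us_dst_end_day(int year)
{
    return nth_sunday(year, 11, 1);
}
```

Both functions return a day of the month; the 2 a.m. local changeover time would still have to be applied separately.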
                                                 www.time.gov
F-22 Raptor Date Line Incident   [Wikipedia]
   February 2007
    • A flight of six F-22 Raptor fighters attempts
      to deploy to Japan
    • $360 million per aircraft
        – (Perhaps $120M RE, rest is NRE)
    • Crossing the International Date Line, computers crash
        –   No navigation
        –   No communications
        –   No fuel management
        –   Almost everything gone!
        –   Escorted to Hawaii by tankers
        –   If bad weather, might have caused loss of aircraft   [DoD]
    • Cause: “It was a computer glitch in the
      millions of lines of code, somebody made an
      error in a couple lines of the code and
      everything goes.”
2013: NASA Declares End to Deep Impact Comet Mission
http://apod.nasa.gov/apod/image/0505/art1_deepimpact.jpg
http://news.nationalgeographic.com/news/2013/09/130920-deep-impact-ends-comet-mission-nasa-jpl/
   Note: Unix epoch is 00:00:00 UTC on 1 January 1970.
   ISO 8601 date format: 1970-01-01T00:00:00Z.
Problems With Time in the Real World
   Coordinated Universal Time (UTC; world time standard)
    • Is not a continuous function due to leap seconds
      (and is only monotonic by putting 61 seconds in a minute just before midnight)
    • Leap year also causes discontinuities, although they’re more predictable
   Time zones
    • Not just on hourly boundaries – Venezuela is UTC/GMT -4:30 hours; no DST
    • TV auto-time-set might sync to channel from wrong time zone via cable feed
   DST changeover date changes fairly often
    • With little warning compared to a 10-20 year embedded system lifetime
   “Y2K”
    • “99” → “00” on Jan 1, 2000 (there were many failures, but world did not end)
    • The GPS 1024 week time rollover (a ship got lost at sea…)
    • And Unix rollover problem (January 19, 2038 03:14:07 GMT)
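The Unix rollover can be reproduced on a host machine with a few lines of C. This is a sketch under assumptions: the function names are hypothetical, and it relies on the host's gmtime() accepting the value. The first function converts the largest signed 32-bit seconds count to UTC calendar fields; the second shows what one more tick does to a 32-bit signed counter.

```c
#include <stdint.h>
#include <time.h>

/* Break the largest signed 32-bit seconds-since-epoch value into UTC
   calendar fields. Returns 0 on success, -1 if gmtime() cannot convert it. */
int last_int32_second(struct tm *out)
{
    time_t t = (time_t)INT32_MAX;
    struct tm *utc = gmtime(&t);
    if (utc == NULL) {
        return -1;
    }
    *out = *utc;
    return 0;
}

/* Advance a signed 32-bit seconds counter by one tick. The add is done in
   unsigned arithmetic to avoid undefined signed overflow; converting back
   wraps to a negative value on the usual two's-complement targets. */
int32_t tick_int32(int32_t now)
{
    return (int32_t)((uint32_t)now + 1u);
}
```

After the wrap the counter reads as a large negative number, which a naive conversion interprets as a date in 1901.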
Internationalization
   The Moral Of The Time Stories:
    • Keep time in GMT or UTC, not local time
    • Keeping time is tricky (rollover, time zones, etc.); kids don’t try this at home
    • And … it’s more than just time keeping
   What day is 02/03/16?
    • In the US: Feb 3, 2016
    • In Europe: 2 March 2016
    • Don’t forget: AM / PM vs. 24 hour clock
   Other internationalization issues:
    • English vs. Metric (F vs. C; ft vs. meters; speed limit in mph + distance in km)
    • Many, many complications on translation
        –   Singular v. plural; Gender
        –   Currency signs & conversion
        –   ASCII vs. 16-bit Unicode
        –   …
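One way to sidestep the 02/03/16 ambiguity is to keep time in UTC internally and print it in ISO 8601 form. The sketch below uses only standard C library calls (gmtime and strftime); the function name is hypothetical.

```c
#include <stddef.h>
#include <time.h>

/* Write a UTC timestamp as ISO 8601 text, year first, 24-hour clock,
   with a trailing 'Z' marking UTC. Returns the number of characters
   written, or 0 if the conversion fails or the buffer is too small. */
size_t format_iso8601_utc(time_t t, char *buf, size_t len)
{
    struct tm *utc = gmtime(&t);
    if (utc == NULL) {
        return 0;
    }
    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", utc);
}
```

The output needs a buffer of at least 21 bytes (20 characters plus the terminator). Conversion to local time, if needed at all, is then a display-only step.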
Time and Computers
   Computers are digital (and therefore discrete) devices
    • Can count up things (for example, seconds)
    • But, can’t actually represent exact analog values
   Time is an analog value
    • Time flows smoothly as far as we’re concerned, not in big chunks
    • How do we get from a smooth, continuous flow to a countable number?
   Basic source of timing information – the system clock
    • A clock provides discrete time chunks at some operating speed
    • Not only cycles the CPU registers, but also provides a basis for counting time!
    • Basis of time in computers is – no surprise – counting “clock” cycles
Physical Clock – What’s The Basis For Time?
   Typical source: oscillator circuit, perhaps augmented
    with GPS time signal
    •   R/C timing circuit; somewhat stable (e.g., 1% resistor gives a lot of drift!)
    •   Commodity crystal oscillator; perhaps 10^-6 /sec stability (14-pin DIP size)
    •   Oven-controlled for wireless communications; perhaps 10^-11 /sec stability
    •   Micro-rubidium atomic oscillator
         – perhaps 10^-11 /month stability
         – 0.7 kg weight
         – 0.3 liter volume
Can You Run Faster Than The Oscillator?
   Course board uses a 4 MHz crystal oscillator
    • Divided to 2 MHz to get accurate 50% duty cycle
    • Specs from module documentation are: 4 MHz +/- 30 ppm
         – 30 ppm = 3 * 10^-5 = 0.003% => +/- 120 Hz
   We want to run at 8 MHz
    • CPU can handle up to 25 MHz
    • Old modules ran at 8 MHz and this avoids many
      potential bugs in course software infrastructure
    • Any guesses as to why new module is at 2 MHz?
   Running faster than the oscillator:
    • Turn on the PLL (Phase Locked Loop)
    • Set PLL multiplier
    • Hardware automatically generates
      faster clock that tracks input oscillator edges
    • What is drift rate of this faster oscillator?
Simple Real-World Drift Example
   A gizmo has a crystal oscillator running at 32,768 Hz + 0.002%
     • 32,768 is a standard watch crystal frequency (15-bit divider gives you 1 Hz)
          – (0.002% is a 2 * 10^-5 drift rate)
     • The product specification requires accuracy of 2 seconds/day
     • Assume perfect software counting of oscillator clock cycles
     • Will the oscillator meet the specification?
0.00002 sec/sec drift rate * (60 sec/min * 60 min/hr * 24 hr/day)
                             = 1.728 sec drift per day (so it meets the spec.)
     • How far will it drift over a 2-year battery life?
1.728 sec/day * (365.25 days * 2 years) = 21 minutes drift over 2 years
   Observations:
     • 10^-6 or 10^-7 is probably desirable for consumer products that keep time
          – Is the course computer good enough to be a clock?
     • There are a lot of seconds in a year (31.6 million ~= 2^25 of them)
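The drift arithmetic on this slide generalizes to any oscillator tolerance. The helper below is a minimal sketch (the name drift_seconds and the macro are illustrative, not from the course materials) that turns a tolerance in parts per million into worst-case accumulated error.

```c
#define SECONDS_PER_DAY (60.0 * 60.0 * 24.0)

/* Worst-case accumulated clock error, in seconds, for an oscillator whose
   frequency is off by `ppm` parts per million over `elapsed_seconds`.
   Assumes perfect software counting, as in the slide's example. */
double drift_seconds(double ppm, double elapsed_seconds)
{
    return ppm * 1e-6 * elapsed_seconds;
}
```

For the gizmo above, drift_seconds(20.0, SECONDS_PER_DAY) reproduces the 1.728 seconds per day figure. Calling it with 30.0 for a 30 ppm crystal gives about 2.6 seconds per day.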
Counting The Clocks
   Time is an integer count of some number of clock “ticks”
     • One year @ 2 MHz takes about 46 bits to represent as an integer – too big to be
       useful for most embedded applications
     • But, most applications don’t need time to the nearest 1/2,000,000 second
     • So, we want time with a bigger granularity than this
   Thus, the concept of the timer
     • Increment a “timer” once every N CPU clocks (this is a clock “tick”)
          – Potentially, tell the CPU to update its software-maintained clock on every timer
            increment; maybe a 32-bit integer
     • Example: Original IBM PC updated time of day 18.2 times/second
          – Windows Forms timer is still that speed (55 msec)
     • Many Unix systems have base timers that run at 30 or 60 times/second
          – Why this frequency?
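How wide a counter has to be for a given tick rate and time span is easy to check on a host machine. The sketch below uses hypothetical helper names and a 365.25-day year; it counts the bits needed to hold one year of ticks.

```c
#include <stdint.h>

/* Number of bits needed to hold `count` as an unsigned integer. */
unsigned bits_needed(uint64_t count)
{
    unsigned bits = 0;
    while (count != 0) {
        bits++;
        count >>= 1;
    }
    return bits;
}

/* Ticks in one 365.25-day year (31,557,600 seconds) at the given rate. */
uint64_t ticks_per_year(uint64_t ticks_per_second)
{
    return 31557600ULL * ticks_per_second;
}
```

Comparing bits_needed(ticks_per_year(2000000)) against bits_needed(ticks_per_year(1)) shows how much counter width a coarser tick saves: a once-per-second count for a year fits in 25 bits.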
Course Chip Timing Support
                                   Ignore for today
                                   [Freescale]
Hardware Timer Operation
   “Channels” and “IOC” items are for pulse inputs/outputs
    • Not relevant to this lecture
   Prescaler
    • Divide system clock by an integer value as input to timer
        – System clock is 8 MHz for course HW; defaults to some other speed in simulator
    • PR[2:0] controls prescale amount
        – Divide bus clock by: 1, 2, 4, 8, 16, 32, 64, 128
   16-bit Counter -- TCNT
    • “up” counter – always increments
    • Clocked by prescaled bus – one increment every 1, 2, 4, 8, … , 128 bus clocks
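The prescaler arithmetic can be captured in two small helpers. This is a host-side sketch with hypothetical function names; it assumes the divide ratio is 2 raised to the PR[2:0] value, matching the 1, 2, 4, … , 128 list above.

```c
#include <stdint.h>

/* Seconds per single TCNT increment when the bus clock is divided by
   2^pr, where pr is the 3-bit PR[2:0] prescale field (0..7). */
double tcnt_tick_seconds(uint32_t bus_hz, unsigned pr)
{
    uint32_t divide = 1u << (pr & 0x7u);
    return (double)divide / (double)bus_hz;
}

/* Seconds between rollovers of the 16-bit TCNT counter (65536 counts). */
double tcnt_rollover_seconds(uint32_t bus_hz, unsigned pr)
{
    return 65536.0 * tcnt_tick_seconds(bus_hz, pr);
}
```

Passing the bus frequency as a parameter makes it easy to compare the 8 MHz course hardware against a simulator or a module running at a different speed.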
                                   [Freescale]
Reading The Hardware Timer
      // set TEN = 1   Timer Enable   TSCR1 bit 7
      TSCR1 |= 0x80;
      // set PR[2:0] Timer prescale in bottom 3 bits of TSCR2
      TSCR2 = (TSCR2 & 0xF8) | 0x04;   // 0x04 bus clock / 16
      for(;;)
      { timer_val = TCNT;              // update timer_val forever
      }
                                   [Freescale]
How Can We Use This To Measure Time?
   Every time TCNT rolls over to zero, increment a software time counter
     • This is really inefficient!!! – but demonstrates the general idea
 int time_count = 0;
 // set TEN = 1   Timer Enable   TSCR1 bit 7
 TSCR1 |= 0x80;
 // set PR[2:0] Timer prescale in bottom 3 bits of TSCR2
 TSCR2 = (TSCR2 & 0xF8) | 0x07;   // 0x07 bus clock / 128
 // Only works if loop is faster than timer increments!
 for(;;)
 { // increment time_count whenever TCNT reaches zero
   if (TCNT == 0)
   { time_count++;
     while (TCNT == 0); /*wait for TCNT to change again*/
   }
 }
How Fast Does That Time Counter Increment?
   Analytically:
    • 8 MHz module
    • 16-bit counter rolls over every 65536 counts
    • Prescale by 128, so roll over happens 128 times slower
        time = 65536 * 128 / 8,000,000 = 1.048576 seconds
    • (How long for 2 MHz Module?)
   Experimentally (via simulator set for 8 MHz):
    •   Set breakpoint at: time_count++;
    •   First breakpoint:            8,372,411
    •   Second breakpoint:          16,761,026
    •   Elapsed time:               8388615 clocks = 1.048577 seconds
         – (accurate within less than time to execute the loop testing for zero)
Accuracy
   What if we wanted to display time with seconds?
    • This hardware won’t make that easy!
    • Can’t get exactly 1 second tick values from hardware
    • Can do better by updating a lot more frequently than every second
   For example, to display time in seconds…
    • Find a divider value for which TCNT rolls over every 0.025 to 0.10 seconds
         – This is how the IBM PC got 0.055 second ticks – it was an “easy” divider value
    • Update a software counter on every TCNT rollover
    • Whenever that software counter exceeds 1 second of value,
      update the seconds count
    • This still won’t display exact seconds….
         – Accurate to within TCNT rollover period plus sampling jitter
         – But for a clock the human eye can only “see” about 0.05 to 0.1 seconds anyway
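Finding a divider that puts the TCNT rollover period in a target window is a short search over the eight prescale settings. The sketch below is illustrative only: find_prescale is a hypothetical name, and it assumes a 16-bit counter with power-of-two prescale values as described for this timer.

```c
#include <stdint.h>

/* Return the smallest PR[2:0] value (0..7) whose TCNT rollover period lies
   in [min_s, max_s] seconds, or -1 if no prescale setting fits. */
int find_prescale(uint32_t bus_hz, double min_s, double max_s)
{
    for (int pr = 0; pr <= 7; pr++) {
        double period = 65536.0 * (double)(1u << pr) / (double)bus_hz;
        if (period >= min_s && period <= max_s) {
            return pr;
        }
    }
    return -1;
}
```

With an 8 MHz bus and the 0.025 to 0.10 second window, the first setting that fits is divide-by-4, which gives a rollover about every 33 msec.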
Design To Track Seconds
   Keep state machine to track rollover
    • Only needs to sample TCNT a few times per
      rollover to avoid missing one
    • Accuracy improved with sampling speed
    • Do other application stuff in both “TCNT high bit
      set” and “TCNT high bit clear” states
    • But how do we handle the rollover?
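One possible shape for the two-state tracker described above is sketched here. The type and function names are hypothetical. The caller passes in a fresh TCNT sample, so the same logic can be exercised on a host machine with made-up samples.

```c
#include <stdint.h>

typedef struct {
    uint8_t  high_seen;   /* 1 once TCNT's top bit has been observed set */
    uint32_t rollovers;   /* completed TCNT rollovers counted so far */
} rollover_tracker;

/* Feed one TCNT sample. Must be called more often than once per half
   rollover period so that neither state is skipped.
   Returns 1 when this sample completes a rollover, otherwise 0. */
int rollover_sample(rollover_tracker *t, uint16_t tcnt)
{
    int high = (tcnt & 0x8000u) != 0;
    if (high) {
        t->high_seen = 1;       /* "TCNT high bit set" state */
        return 0;
    }
    if (t->high_seen) {         /* high bit went from set to clear: wrapped */
        t->high_seen = 0;
        t->rollovers++;
        return 1;
    }
    return 0;                   /* still in "TCNT high bit clear" state */
}
```

Unlike polling for TCNT == 0, this does not need to catch any one particular count value, which is why a few samples per rollover are enough.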
Design Example – Don’t Lose Fractions
   Assume bus clock divide by 64; 25 MHz board
    • 65536 * 64 / 25,000,000 → TCNT rollover every 0.167772 seconds
    • (Need to sample TCNT every 0.08 seconds to catch the rollover event)
    • If we want to keep seconds, then increment seconds every 5 or 6 rollovers
   How do we track fractional seconds without floating point?
    • Answer: 16.16 fixed point! – a 32-bit fixed point integer
        – unsigned long current_time;
        – Top 16 bits are integer seconds
        – Bottom 16 bits are fractional seconds
          (each integer “count” = 1/65536 seconds = 0.00001525878906 seconds)
    • For each TCNT rollover, add 0.167772 / 0.00001525878906
                                        = 10995 fractional seconds (10995.116, truncated)
    • TCNT rollover becomes:        current_time += 10995;
    • Seconds are in:               (current_time >> 16) & 0xFFFF;
Time Accuracy Calculation
   An approximation makes life easy, but how far off is it?
   In 10,000 seconds, TCNT will roll over:
    • 10,000 * 25,000,000 / (65536 * 64) = 59,605 times
    • That’s 10995 fractional seconds added to the 32-bit time counter
      10995 * 59,605 = 655,356,975 → $270F F42F
    • Top 16 bits are $2710 (rounded) → 10,000, so the error is too small to see in whole seconds
    • Truncating 10995.116 to 10995 loses 0.116 / 10995.116 → about 0.001% per rollover
    • How could we be better?
   Is this good enough?
    • Crystal Oscillator is 4 MHz +/- 0.003%, which is about three times larger than the truncation error
    • Truncation error is: 0.00106% * 31536000 seconds/year = 5.6 minutes per year; 0.9
      seconds/day
    • NOTE: Our time counter rolls over every 64K seconds = 18.2 hours
        – What this really means is you want 32.32 fixed point time for longer operation
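Rather than working out the per-rollover increment by hand, it can be computed from the clock parameters, and the truncation error measured directly. The sketch below is a host-side illustration with hypothetical function names. It uses 64-bit intermediates, which the course compiler may or may not support efficiently.

```c
#include <stdint.h>

/* 16.16 fixed-point increment (units of 1/65536 second) to add to the time
   counter on each TCNT rollover. `prescale` is the divide ratio (1..128).
   The integer division truncates; that truncation is the approximation. */
uint32_t rollover_increment(uint32_t bus_hz, uint32_t prescale)
{
    uint64_t bus_clocks_per_rollover = 65536ULL * prescale;
    return (uint32_t)((bus_clocks_per_rollover << 16) / bus_hz);
}

/* Whole seconds the counter reports after `rollovers` rollovers. A 64-bit
   accumulator is used here so long runs do not wrap at 64K seconds. */
uint32_t reported_seconds(uint32_t increment, uint32_t rollovers)
{
    uint64_t current_time = (uint64_t)increment * rollovers;
    return (uint32_t)(current_time >> 16);
}

/* Fraction of true elapsed time lost by truncating the increment
   (0.0 would mean the increment is exact). */
double truncation_error(uint32_t bus_hz, uint32_t prescale)
{
    double exact = 65536.0 * (double)prescale * 65536.0 / (double)bus_hz;
    double used = (double)rollover_increment(bus_hz, prescale);
    return (exact - used) / exact;
}
```

Multiplying truncation_error() by the number of seconds in a day or a year gives the accumulated software error, which can then be compared against the crystal tolerance.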
Why Are Timers Such A Big Deal?
   No more counting NOPs in loops
    • NOP-delay loops are a pain to build and get right
    • And they break every time you change the oscillator speed or CPU clocks/instr!
   Lets processor do other useful work while keeping time
    • Can check timer once in a while to see if top bit of TCNT rolled over
    • Combined with “interrupts” (next lecture), processor doesn’t have to check time
      periodically – is just notified on every rollover of TCNT
   Time values independent of software execution
    • Not sensitive to variations in instruction timing
    • Still works if software inside loop has multiple “if/else” paths…
      because it is not based on how long software takes to run
    • Still works at different clock speed (need to adjust the prescale value)
   BUT, it’s a bit of work getting accurate time-of-day values
    • Have to take into account exactly how often HW timer ticks and rolls over!
Classic Timer Mistakes – “Nanosecond” Time
    • [http://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html#Elapsed-Time]
    • — Data Type: struct timespec
      The struct timespec structure represents an elapsed time. It is
      declared in time.h and has the following members:
    • long int tv_sec
       – This represents the number of whole seconds of elapsed time.
    • long int tv_nsec
       – This is the rest of the elapsed time (a fraction of a second),
         represented as the number of nanoseconds. It is always less
         than one billion.
   This value reports time in nanoseconds
    • That means it is a number of nanoseconds
    • That does NOT mean it is the nearest nanosecond
    • The underlying hardware has a timer that only increments once in a while!
    • Classic mistake is to ignore quantization error in the timers
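On a POSIX host, the resolution a clock claims to have can be queried with clock_getres(), which is a quick way to see that nanosecond units do not imply nanosecond ticks. This is a sketch that assumes a POSIX environment; the helper names are hypothetical. The second function shows the quantization a coarse hardware tick imposes on a nanosecond reading.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_getres() in <time.h> */
#endif
#include <time.h>

/* Reported resolution of the given POSIX clock in nanoseconds,
   or -1 if the clock is not available. */
long long clock_resolution_ns(clockid_t clock_id)
{
    struct timespec res;
    if (clock_getres(clock_id, &res) != 0) {
        return -1;
    }
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Round a nanosecond reading down to a multiple of the timer tick, which
   is all the underlying hardware can actually distinguish. */
long long quantize_ns(long long reading_ns, long long tick_ns)
{
    return (reading_ns / tick_ns) * tick_ns;
}
```

For example, with a 16,000 ns tick (8 MHz bus, prescale 128), every reading from 96,000 ns up to 111,999 ns quantizes to the same value. Even the resolution reported by clock_getres() is only what the system claims, so measured behavior may be coarser.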
Classic Timer Mistakes – Non-Atomic Access
                                                        [Freescale]
   What happens if you use two 8-bit reads (LDAA) instead of LDD?
    • 16-bit fetch locks the value as it is being read; gives correct result
    • Timer hardware might increment between two byte-sized reads
    • AND, that increment might include a carry from low 8 to high 8 bits
    • $03FF → $0400: read hi then lo gives $03 … $00 => $0300!
    • This is an absolutely classic timer bug – don’t let it happen to you!!!
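The torn read can be reproduced on a host machine with a simulated timer that advances between byte accesses. Everything in this sketch is hypothetical (sim_timer, read_torn, read_consistent). On the course chip the straightforward fix is the single 16-bit read described above; the re-read loop is a common workaround for hardware that only offers byte-wide access.

```c
#include <stdint.h>

/* Simulated free-running 16-bit timer. Each byte read advances the count
   by `step` to model the hardware incrementing between bus accesses. */
typedef struct {
    uint16_t count;
    uint16_t step;
} sim_timer;

static uint8_t read_hi(sim_timer *t)
{
    uint8_t v = (uint8_t)(t->count >> 8);
    t->count = (uint16_t)(t->count + t->step);
    return v;
}

static uint8_t read_lo(sim_timer *t)
{
    uint8_t v = (uint8_t)(t->count & 0xFFu);
    t->count = (uint16_t)(t->count + t->step);
    return v;
}

/* Buggy: the two bytes can come from different counter values. */
uint16_t read_torn(sim_timer *t)
{
    uint8_t hi = read_hi(t);
    uint8_t lo = read_lo(t);
    return (uint16_t)((hi << 8) | lo);
}

/* Workaround: re-read the high byte and retry if it changed while the low
   byte was being read. Assumes the timer ticks slowly relative to reads. */
uint16_t read_consistent(sim_timer *t)
{
    uint8_t hi, lo, hi2;
    do {
        hi  = read_hi(t);
        lo  = read_lo(t);
        hi2 = read_hi(t);
    } while (hi != hi2);
    return (uint16_t)((hi << 8) | lo);
}
```

Starting the simulated timer at $03FF with a step of 1 makes read_torn() return $0300, the same wrong value as in the slide's example.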
 Classic Timer Mistakes – Rollover
    http://www.nytimes.com/2015/05/01/business/faa-orders-fix-for-possible-power-loss-in-boeing-787.html
    Eventually integer timers roll over
       • Assume time kept in 100ths of a second as a signed 32-bit integer (wrong type!)
       • 0x7FFFFFFF = 2147483647 / (24 * 60 * 60 * 100) = 248.55 days to overflow
       • (Note: unsigned int would roll over after 497 days)
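Two small helpers illustrate the rollover arithmetic. This is a sketch with hypothetical names. The first reproduces the days-to-overflow calculation. The second shows a widely used idiom, not taken from the slides, for measuring an interval on an unsigned free-running counter so that a single rollover in the middle does not corrupt the result.

```c
#include <stdint.h>

/* Days until a counter whose largest value is `max_count` overflows when
   incremented `ticks_per_second` times per second. */
double days_to_overflow(double max_count, double ticks_per_second)
{
    return max_count / ticks_per_second / (24.0 * 60.0 * 60.0);
}

/* Ticks elapsed from `start` to `now` on a free-running unsigned 32-bit
   counter. Unsigned subtraction wraps modulo 2^32, so the result is still
   correct across one rollover, provided the true interval is shorter than
   2^32 ticks. */
uint32_t elapsed_ticks(uint32_t start, uint32_t now)
{
    return now - start;
}
```

The subtraction idiom only helps with intervals; a time-of-day value that must never wrap still needs a wide enough counter.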
http://rgl.faa.gov/Regulatory_and_Guidance_Library/rgad.nsf/0/584c7ee3b270fa3086257e38004d0f3e/$FILE/2015-09-07.pdf
Watchdog Timers – Detecting Software “Hangs”
   A common symptom of software problem – system hang
     •   Could be an infinite loop
     •   Could be continually chasing a “wild” pointer around
     •   Could be corrupted data
     •   … but often systems “lock up” or “hang”
   Good general-purpose remedy – reboot system if it hangs
     • But, there is no person around to press “ctl-alt-delete”
     • So, let the watchdog timer do it instead
     • BUT realize this doesn’t solve all problems
          – just some that are nice to address
   Basic watchdog idea:
     •   Have a hardware timer running all the time (count-down timer)
     •   When timer reaches zero, it resets the system
     •   Software periodically “kicks” (or “pets”) the watchdog, restarting the count
     •   If software has “kicked” the watchdog often enough, no reset takes place
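The basic watchdog idea can be modeled in software to experiment with timeout values before touching hardware. This is a simulation sketch only, with hypothetical names; a real watchdog is a hardware counter, and its reset restarts the whole system rather than returning a value.

```c
#include <stdint.h>

typedef struct {
    uint32_t reload;      /* count loaded at reset and on every kick */
    uint32_t remaining;   /* current count-down value */
    unsigned resets;      /* how many times the watchdog has fired */
} watchdog;

void watchdog_init(watchdog *w, uint32_t reload)
{
    w->reload = reload;
    w->remaining = reload;
    w->resets = 0;
}

/* Software calls this to show it is still making progress. */
void watchdog_kick(watchdog *w)
{
    w->remaining = w->reload;
}

/* Called once per watchdog clock tick. Returns 1 if the count reached zero
   and the (simulated) system was reset; the count then restarts, as it
   would after a real system reset. */
int watchdog_tick(watchdog *w)
{
    if (w->remaining > 0) {
        w->remaining--;
    }
    if (w->remaining == 0) {
        w->resets++;
        w->remaining = w->reload;
        return 1;
    }
    return 0;
}
```

Driving watchdog_tick() from a test loop, with and without calls to watchdog_kick(), shows how late a kick can arrive before a reset occurs.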
Watchdog General Block Diagram
   System reset starts the watchdog initially
     • Clock is used to count-down the watchdog timer
     • Kick restarts the watchdog
     • Watchdog resets CPU when it reaches zero
         CLOCK
                                         KICK
                  WATCHDOG                         Microcontroller
                    TIMER               RESET
                                                        CPU
Course MCU Watchdog – “COP”
   See chapter 9 of data sheet – “Clocks and Reset Generator” (CRGV4)
    • COP = “Computer Operating Properly” – Freescale name for watchdog
                                   [Freescale]
When To Kick
   Kick periodically
    • Often enough to avoid reset
   Kick only when doing so means the
    system is really alive
    • Between major subroutine calls
    • Only in the main program loop
    • NEVER within individual task loops
        – Except if you are sure they will
          terminate (e.g., fixed integer loop
          bounds)
        – And even then, probably only in the
          main program loop
   These are basic rules
    • Advanced topic: with multitasking
      system, every task should participate in a
      consensus-based watchdog reset
      operation
Watchdog Timer Select
   Set watchdog so that it is fast enough to catch problems quickly
    • But not so fast that healthy software misses a kick and gets reset needlessly
    • Requires estimate of program execution speed between kicks
                                   [Freescale]
Petting The Watchdog (Kicking the COP)
      NOTE – multi-operation “kick” to reduce chance of random code kicking it
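A multi-operation kick can be modeled as a tiny state machine. In the sketch below the two-step sequence (0x55 followed by 0xAA) is an assumption recalled from Freescale COP documentation and should be checked against the data sheet for the course chip; the type and function names are hypothetical. The model simply counts wrong values, whereas real hardware may respond to them with an immediate reset.

```c
#include <stdint.h>

/* ASSUMED kick sequence; verify against the chip's data sheet. */
#define COP_ARM_FIRST  0x55u
#define COP_ARM_SECOND 0xAAu

typedef struct {
    int      armed;      /* 1 after the first value has been written */
    unsigned restarts;   /* times the watchdog count was restarted */
    unsigned faults;     /* writes of any value outside the sequence */
} cop_sim;

/* Model one write to the kick register. Returns 1 only when this write
   completes the two-step sequence and restarts the watchdog count. */
int cop_write(cop_sim *c, uint8_t value)
{
    if (value == COP_ARM_FIRST) {
        c->armed = 1;
        return 0;
    }
    if (value == COP_ARM_SECOND) {
        if (c->armed) {
            c->armed = 0;
            c->restarts++;
            return 1;
        }
        return 0;        /* second value without the first: no restart */
    }
    c->armed = 0;        /* stray write breaks the sequence */
    c->faults++;
    return 0;
}
```

Because a single stray store cannot complete the sequence, runaway code that happens to write to the register is much less likely to keep the watchdog quiet.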
                                   [Freescale]
Bad Watchdog Use
   Kicking inside a single task loop
    • OK, so that loop is alive, but what about other tasks?
   Kicking in a great many places in the code
    • Only kick in the main loop; as few places as possible
    • What if you make a mistake and kick inside a loop?
   Hooking up a timer interrupt to kick the watchdog
    • Every time timer rolls over, kick the watchdog
    • Only proves the timer is working, not the main tasks!
    • (There are very special exceptions for multitasking)
   Watchdog can be defeated by software
    • HW should prevent watchdog turning off once on
    • HW should prevent masking/disabling the watchdog
      reset once enabled
    • Watchdog should require sequence of values to “kick”
    • Some systems forget to turn on watchdog
Watchdog Margin
   Let’s say you set the watchdog where you think it should be
    • You compute expected task execution time
    • In the lab, you never see a watchdog trip
        – Hopefully you don’t blame one on something else – make sure they are
          unmistakable!
    • In the field, the watchdog trips – what happened?
        – Well, obviously something you didn’t test
        – Maybe you set the watchdog too close to the edge!
   Testing watchdog margin
    • Change the watchdog divider until it trips
        – Does it trip where you expect? (If not, you don’t understand something)
    • Add some time-wasting nop-loops in your code
        – Does it trip where you expect? (If not, you don’t understand something)
Multi-Tasking Watchdog
   Consider a preemptive tasking system
    • (We’ll talk more about preemption later – we just mean “multi-tasking” here)
    • Assume there is a watchdog timer (a COP timer)
    • kick() restarts the watchdog timer at its initial value
    void task0(void) { ..Do stuff..; kick(); …more…; }
    void task1(void) { ..Do stuff..; kick(); …more…; }
    void task2(void) { ..Do stuff..; kick(); …more…; }
    void task3(void) { ..Do stuff..; kick(); …more…; }
    • What’s wrong with the above approach?
    • (Murphy00 supplemental reading also talks about this issue)
Effective Multi-Tasking Watchdog Approach
    void task0(void) { ..Do stuff..; Alive(0x1); }
    void task1(void) { ..Do stuff..; Alive(0x2); }
    void task2(void) { ..Do stuff..; Alive(0x4); }
    void task3(void) { ..Do stuff..; Alive(0x8); }
   Main idea – each task sets a bit indicating it has run
    • Separate watchdog monitor task kicks watchdog only when every task has reported in
    • Needs to be modified to account for task periods, but this is the basic idea
    static uint16 watch_flag = 0;
    void Alive(uint16 x)
    { SEI();                     // disable interrupts
      watch_flag |= x;           // set task’s “I’m Alive” bit
      CLI();                     // enable interrupts
    }
    void taskw(void)             // run periodically
    { if (watch_flag == 0x0F)    // if all tasks alive
      { kick();                  // kick watchdog
        watch_flag = 0;          // erase flags
      }
    }
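The flag scheme can be tried out on a host machine before moving it to the target. The version below is a sketch with hypothetical names: a counter stands in for the hardware kick(), and interrupt masking is left out because the host test is single-threaded. On the real system the read-modify-write of the flag word still needs the protection shown in the slide code.

```c
#include <stdint.h>

#define ALL_TASKS 0x0Fu            /* one bit per task, four tasks */

static uint16_t alive_flags = 0;   /* bits set by tasks as they report in */
static unsigned kicks = 0;         /* stands in for the hardware kick() */

/* Each task calls this with its own bit once it has finished a pass. */
void report_alive(uint16_t task_bit)
{
    alive_flags |= task_bit;
}

/* Monitor task, run periodically: kick only if every task has reported
   since the last kick. Returns 1 if the watchdog was kicked. */
int monitor_run(void)
{
    if ((alive_flags & ALL_TASKS) == ALL_TASKS) {
        kicks++;
        alive_flags = 0;
        return 1;
    }
    return 0;
}

unsigned kick_count(void)
{
    return kicks;
}
```

If any one task stops calling report_alive(), monitor_run() never kicks again, so the hardware watchdog eventually resets the system even though the other tasks are still running.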
Review
   Time of day
    • Accuracy – time measurement and quantization
    • Drift – due to oscillator speed AND software inaccuracies
    • Converting a hardware timer to time of day
   Hardware timer operation
    • Setting up a timer, including frequency calculations
    • Classic timer mistakes
   Watchdog timer operation
    •   Setting up the watchdog, including frequency calculations
    •   How to ensure a watchdog timer is set properly
    •   Rules for good and bad watchdog use
    •   Multi-tasking watchdog
Lab Skills
   Counter/timer
    • Be able to set, read, and generate time of day from a hardware timer
   Watchdog timer
    • Be able to set up and measure effects of watchdog timer