Biomod
Shawn Douglas
This book is for sale at http://leanpub.com/biomod
Support
Translation and editing costs were paid by the BIOMOD Foundation with support from many
sponsors².
S.M.D. was supported by a Burroughs Wellcome Fund Career Award at the Scientific Interface,
the Pew-Stewart Scholars Program for Cancer Research, the Del E. Webb Foundation, the
National Science Foundation (CCF-1317694, CCF-1453847, SI2-1740282), the Army Research Office
(W911NF-14-1-0507), the Office of Naval Research (GRANT12390474), and the UCSF Program for
Breakthrough Biomedical Research.
¹http://cbi-society.org/home/documents/eBook/Biomod2016.pdf
²http://biomod.net/sponsors/
Chapter 1 Let’s go to BIOMOD
1. Welcome to the World of DNA Molecule Design
It is now possible to design biological macromolecules, typically DNA sequences, to create com-
plex nanodevices and nanostructures. This is a revolution that is comparable to the industrial
revolution but occurring at the molecular level. Molecular design could grow into a gargantuan
field of technology comparable to silicon semiconductor technology developed in the 20th
century. Welcome to a world of molecular design that holds infinite potential.
computer. Structural DNA nanotechnology also tries to combine DNA nanostructures with other
molecules such as proteins and lipids (see section 49).
2. What is BIOMOD?
This section describes previous competitions of BIOMOD, its relation to Robocon Japan, and
how it all started.
Introduction
Welcome to BIOMOD. In a few months, you will be trying to create an unprecedented molecular
robot/device by trial and error, from the brainstorming of project ideas to the final presentation.
Before that, you should make sure you understand the logic and rules of the BIOMOD competi-
tion.
Background
Progress in molecular biology, organic chemistry, and other disciplines in the past several
decades has revealed a trove of information about biomolecules. In particular, we have gained
a great deal of information on nucleic acids (DNA and RNA) and related matters. The famous
discovery of the double helix structure by Watson and Crick was followed by the establishment
of enzymatic manipulation, and the introduction of novel functions by means of artificial
bases. Such progress in basic science and the resultant technologies together provide a basis
to develop novel fields in engineering such as modern biotechnology and nanotechnology.
Only decades ago, we considered the rules of biological phenomena to be ungovernable using
Originally, BIOMOD planned to borrow the entire competition framework of iGEM (International
Genetically Engineered Machine competition) [5], a student competition in synthetic biology.
The idea was that incorporating the similarly standardized methodology of DNA origami and
CAD (for easy design of DNA origami) into the framework of iGEM and applying it to BIOMOD
should result in similarly rapid progress in the competition, and thus in the field of molecular
robotics.
However, Japanese researchers thought that the judging system of iGEM was not fully transpar-
ent, and, in terms of both scientific progress and education, they wanted a more transparent and
more equitable judging system that would still be suitable for student participants and teachers.
As a result, we proposed judging criteria that were as fair as possible, together with a Robocon-like
pilot competition and a Molecular Robotics Award, which in the long term could turn into a
more Robocon-like competition. Our proposal was to introduce the essence of the open and fair
judging system of Robocon into the framework of iGEM. Our proposal combined advantages of
two competitions of totally different origins and most of the ideas were adopted. However, the
proposed judging system based on point scoring is still utilized essentially in its original form
in BIOMOD. After the Robocon-like pilot competition and Molecular Robotics Award at the 1st
competition in 2011, we concluded that it was too early to have a Robocon-like competition as
part of BIOMOD. The award remains as a special award given to the project that contributed the most
to molecular robotics among the projects of all participating teams.
—Shogo Hamada (Cornell University)
- Silver: Satisfied criteria for Bronze, plus at least one device (part of the system) in the team’s
design worked as expected.
- Gold: Satisfied criteria for Silver, and the overall point score from Website + YouTube + Jamboree
presentation is in the top 50% of all teams.
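The medal tiers above can be read as a small decision rule. The following is only a sketch under stated assumptions: the Bronze criteria are not spelled out in this section, so they are passed in as a flag (`bronze_ok`), and the handling of ties and of an odd number of teams in the "top 50%" test is a guess, not the official rule. All names are hypothetical.

```python
from typing import Dict


def medal(team: str, totals: Dict[str, float],
          bronze_ok: bool, device_worked: bool) -> str:
    """Return 'Gold', 'Silver', 'Bronze', or 'None' for one team.

    totals maps team name -> Website + YouTube + presentation points.
    bronze_ok stands in for the Bronze criteria (not listed here).
    """
    if not bronze_ok:
        return "None"
    if not device_worked:
        return "Bronze"
    # Rank 1 is the highest total; tied teams share the better rank (assumption).
    rank = 1 + sum(1 for score in totals.values() if score > totals[team])
    in_top_half = rank <= len(totals) / 2
    return "Gold" if in_top_half else "Silver"


# Hypothetical scores, for illustration only.
totals = {"A": 90, "B": 80, "C": 70, "D": 60}
print(medal("B", totals, bronze_ok=True, device_worked=True))   # Gold
print(medal("C", totals, bronze_ok=True, device_worked=True))   # Silver
```

Note that under this reading a team cannot reach Gold on points alone: a working device is still required first.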
Most of the awards celebrate the top three teams. The Website, YouTube, and presentation scores
are independent of each other. As of March 2016, the Tohoku University team is the only
Japanese team to have won first place in any of the major categories (YouTube Award, Presentation
Award, and Grand Prize in 2012, and YouTube Award and Grand Prize in 2015). In the Website
category, the best Japanese team was the Tohoku University team in 2014, which won 2nd place.
The Audience Choice Award, not included in overall score ranking, is given to the best presen-
tation determined by vote of student participants. Usually, the Audience Choice Award winners
are the same as the Best Presentation Award winners.
There are two more awards, each of which is given to only one team: the Best T-shirt Award
(TEAM TITECH won this in 2013), based on the vote of participants, and the MOLBOT Award, supported
by the Molecular Robotics Research Group from Japan (see the column for details).
About Judges
For each category, the judges consist of mentors of participating teams and invited experts.
Judging is done on an anonymous basis, and no mentor is allowed to evaluate her/his own
team. For each team, online content evaluation (Website and YouTube) is allotted to five
randomly chosen judges. Live presentation evaluation is done by all judges present. The highest
and lowest judges’ scores are excluded, and the three remaining judges’ scores are averaged.
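The averaging rule described above is a trimmed mean. As a minimal sketch, assuming it applies to the five scores given for online content (the function name and the example scores are hypothetical):

```python
from typing import Sequence


def online_score(judge_scores: Sequence[float]) -> float:
    """Drop the highest and the lowest of five scores, average the other three."""
    if len(judge_scores) != 5:
        raise ValueError("expected exactly five judges' scores")
    ordered = sorted(judge_scores)
    middle = ordered[1:-1]  # discard one lowest and one highest score
    return sum(middle) / len(middle)


# Hypothetical scores: 30 and 50 are discarded, (40 + 41 + 42) / 3 remains.
print(online_score([30, 40, 41, 42, 50]))  # 41.0
```

A practical consequence is that a single unusually harsh or generous judge cannot move your score, so it pays to make the content clear to the typical judge rather than to impress one expert.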
in one summer? (4) Merit (worth up to 5 points): Is the proposed solution a good one? Is it
particularly elegant or innovative?
- Project Documentation. (1) Clarity (worth up to 10 points): Is the project description well-
written and easy to understand? Does it include the necessary background information and
motivation for the project, methods, results, and discussion? Are the figures easy to understand?
(2) Transparency (worth up to 5 points): Are all of the raw experimental data and source files
easily accessible? Would it be straightforward to attempt to reproduce the team’s results? (3)
Layout (worth up to 5 points): Is the team’s project page arranged in a clear and logical fashion?
- Execution: Did the team accomplish what they set out to do?
MOLBOT Award
The MOLBOT Award, sponsored by the Molecular Robotics Research Group from Japan, is given
to the project that contributed most to molecular robotics in the BIOMOD competition of that
year. The winner is determined by the judges, based only on the live presentation score. Thus,
in order to win this award, you must emphasize in the presentation how you tried to assemble
a molecular robot.
Japanese teams have dominated this category. In the 1st and 3rd competitions,
Japanese teams won this award. In 2014, the Columbia University team won the award, becoming the first
non-Japanese winner. Competition for this award will become more and more
intense. However, because molecular robot projects are highly difficult and thus rarely
attempted by students, a project with a good concept can compensate for less than perfect
results. I hope the readers of this book will try for this award.
Day 1: Registration
Most teams will fly into San Francisco International Airport (SFO). Japanese teams can take a
direct flight from Narita. To get from the airport to the hotel and/or registration, you can use car service apps
such as Lyft or Uber anywhere in San Francisco and the surrounding areas.
Download the app on your smartphone and set up an account. Both apps use your smartphone’s
GPS to detect your location and connect you with the closest available drivers. You
will receive a text message with a photograph of your background-checked driver and their car,
you will be able to track their approach, and you will get another text message when they arrive. Payment can
be split among passengers and is charged securely to the credit card on file. No cash
changes hands.
At the registration desk in the hotel lobby, you will get a registration packet with a name badge,
and a BIOMOD original T-shirt. If you arrive after 20:00, you can register at the competition venue
on the morning of the Jamboree.
On Day 1, conference rooms in the hotel are usually available for practicing presentations. This
is a good idea, but you have to reserve them. Rehearsing is crucial to success on Day 2. Some of
the teams keep rehearsing until midnight. Sometimes teams are still rehearsing on the day of
the presentation, but this generally does not seem to result in a good presentation. Have everything
ready by the day of the presentation.
A Day of 25 Hours
Daylight Saving Time ends on the first Sunday in November, often around the same time that
BIOMOD is held. This means that clocks are set back one hour at 2:00 a.m. There is no special
ceremony such as bell ringing, because it happens in the middle of the night. Nevertheless, it is surely a special night
that you cannot experience in Japan or in other countries that do not observe Daylight Saving Time.
After the Jamboree is over, you are free until the return flight. Let’s stroll around UCSF and San
Francisco. Some sites of interest are listed in the column.
evaluate what your team achieved. If there is no such figure, they will start looking for what has
not been achieved.
Execution and Feasibility are each worth up to 15 points, whereas Relevance and Merit are
each worth up to only 5 points. Therefore, a subject that is only fairly interesting (Relevance: 3
points) but highly feasible (Feasibility: 12 points, and thus high Execution points) will result in a better total
score. It is therefore better to assemble an achievable, unique structure than to try for
medical and industrial products that are not easily achievable (and not unique) and thus are
likely to lose points.
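The trade-off in the paragraph above can be checked with simple arithmetic. This sketch uses the point caps stated there; only Relevance 3 and Feasibility 12 come from the text, and every other number in the two example projects is a hypothetical value chosen for illustration.

```python
# Point caps as stated in the paragraph above.
CAPS = {"relevance": 5, "feasibility": 15, "merit": 5, "execution": 15}


def total(points: dict) -> int:
    """Sum the four criteria after checking each against its cap."""
    for name, value in points.items():
        if not 0 <= value <= CAPS[name]:
            raise ValueError(f"{name} must be between 0 and {CAPS[name]}")
    return sum(points.values())


# Modest but achievable project (merit and execution values are hypothetical).
modest = {"relevance": 3, "feasibility": 12, "merit": 3, "execution": 12}
# Ambitious project that is hard to finish in one summer (all values hypothetical).
ambitious = {"relevance": 5, "feasibility": 5, "merit": 4, "execution": 4}

print(total(modest), total(ambitious))  # 30 18
```

Even with full marks for Relevance, the ambitious project can gain at most 2 points there, while it can lose far more on the two 15-point criteria.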
If you simply cannot memorize the script for the presentation, do not worry. You
can perform a puppet show while a narrator reads the script behind the curtain. You can also
have one of the members show you cue cards from an audience seat, or you can have the script
written on your hand or on an accessory (fan, scroll, or the like).
6. BIOMOD Calendar
This chapter describes a standard working schedule and team organization. Because BIOMOD
is a team competition with a defined deadline, choice of team organization is very important.
Appropriate team organization will depend on your team members and on support from labo-
ratories.
The team starts in April. The team can be put together in various ways: with friends, with
members of a single laboratory, etc. The BIOMOD website recommends no more than 10
team members. Somewhere between 4 and 10 seems appropriate. Some say that the team
members should have different backgrounds and others like a group with similar backgrounds.
Either seems fine. Having different backgrounds will introduce a variety of ideas and skills into
the team, but runs the risk of causing the project to be less focused. Similar backgrounds will
facilitate mutual understanding, but may not be good for the brainstorming of project ideas.
Good team organization is key.
Once the team is agreed on, you need to go through lectures and textbooks and literature
reviews to find project ideas. The University of Tokyo Kashiwa team assigned relevant chapters
of the textbooks “Molecular Biology of the Cell” and “DNA Nanoengineering” to team members
in a journal club format. Note that too much textbook learning will take away the time and
energy for the brainstorming of project ideas, and may make it feel like another regular class,
resulting in a passive attitude of the members. The Kashiwa team also introduced a literature
review in a journal format, assigning about 2 papers per student. Based on such knowledge, you
will start brainstorming project ideas. This brainstorming is an important factor for the success
of the project, and is described in another section.
When summer vacation comes, the experiment starts. The experiment can be carried out by all
of the team members or only by the experimenter(s), depending on the team. In the case of the
University of Tokyo Kashiwa team, every member worked on the experiment at least 3 days a
week during the summer vacation (except during Obon or other holidays). Many of the members
worked for much longer hours.
A month after the beginning of the summer vacation, there is a trial in Japan. It is held at the
Hongo campus of the University of Tokyo at the beginning of September every year. At this domestic
trial, the primary purpose is to determine the direction of many aspects of the team, so the
project does not need to be complete at this point. Nevertheless, because you will be judged
in the same format as the final, you have to make some preparations for it. The presentation
must be in English, so you will have to practice your English in addition to the presentation itself. The
result of the domestic trial does not necessarily correlate with the result at the final. Many teams
were motivated by their defeat at the trial and had a good result at the final. If your result at the
domestic trial is not good, just cheer up and work hard again toward the final.
After the summer vacation passes and Autumn comes, the team will have to push toward the
final. There are the experiments to be done, the presentations to be prepared, and the Website
and YouTube to be made. You must appropriately assign tasks to the members. All of these steps
Team Organization
As in a club or a part-time job, the most important position is the leader. However, the type
of leader required will depend on the team members so there is no single universal solution.
Usually, anyone who wants to be the leader should be. In addition to this general leader, the
University of Tokyo Kashiwa team also included sub-leaders such as a principal experimenter.
This was because the team leader, who oversees the entire project, might not be able to make
an objective decision due to being too concerned about the experiment if she/he were concurrently
the leader of the experiment. Because BIOMOD is a short-term battle, it is necessary to
balance ideals with reality and short-term considerations with long-term considerations. The
leader should be able to consider everything and make decisions when necessary. The principal
experimenter is responsible for the execution of the experiment, and should communicate with
the leader and the mentors and TA to decide the experimental plan and the goal, and share
information with other experimenters to direct the experiment. During this process, data should
be filed and processed whenever it is appropriate. In the battle against the deadline, there is a
great difference between preparing a Website and its figures from scratch and preparing it from
figures already finished or half-finished for the presentation slides.
Before Meeting
Because you and the members have separate brains, you need meetings. Find a place for both
meeting and working where the team members can come together anytime and even make
noise. You can use club rooms at the university or rented meeting rooms, but you may run into
extra trouble in the final stage before the deadline if the meeting spot is located far from the
advisers. At Tohoku University, a student room serves this purpose. I do not recommend using the home of any
member. The place should be equipped with a fast Internet connection and
a whiteboard, to which hard copies can also be attached with magnets. Digitizing technology
such as electronic whiteboards is attractive but still expensive, slow to respond, and
not good at distinguishing text from figures. Both handouts and whiteboard content can be
conveniently photographed with a cell phone, digitized, and shared. Meetings at
Tohoku University were usually held once a week, and twice a week or on demand from autumn until the
deadline. Put a calendar in the meeting room where everyone can see it, with the schedules of
the different tasks aligned with each other from the start to November, which makes it easier to share
schedules. A weekly countdown will make people aware of the deadline. Also write in the
schedules of the advisers.
What to do at meetings
Check the entire schedule/current problems/process steps/task assignment, share information,
and discuss possible solutions to problems. Brainstorming of project ideas may seem to be the
most important task, but will be discussed later.
Meeting Organization
- Chaired by the team leader. Does not need to be a funny comedian.
- A record keeper is a must. Let the proceedings be read by everyone just after the meeting.
- Distribute handouts. One-way traffic information through slides is insufficient.
- Bring a list of the problems to be discussed.
- Any problem unsolved at the meeting should be assigned to any of the members, with a defined
deadline.
- Thank the observer(s).
- Do not get personal.
- Do not repeat the discussion after the meeting. If you have to, do it at the next meeting.
- Time >>>> (Impassable barrier) >>>> Money
Holding meetings regularly can help you identify problems, and is thus good for the mental
and physical health of the members.
pleased to be outwitted.
Share Your Information with Other Members
If you don’t share information among the team members, it can result in a tragedy such
as several members unknowingly doing the same task separately. Share the contact addresses of
team members so that meeting schedules, room numbers, sick notes, etc. can be sent and received. Careful management of this personal
information is important.
- E-mail: Be polite. Avoid sending a file to the mailing list (share a link to cloud storage).
- Twitter: Fast. Stable. For announcements. Not suitable for discussion.
- Facebook: Pretty fast. Communication before/after journal club or meeting. Not suitable for
storing experimental data.
- Texting / Line: Fast. Stable. For announcements.
- Skype and other video meeting systems: Basically for information sharing between two people.
Check also prepared data and discussion. Seems to allow a conclusion faster than telephone.
Facial effects?
- Google Docs: Fast. Fairly stable. Because it allows simultaneous editing, you and other mem-
bers can edit the proceedings over a Skype chat to save time. Also this can be run on a projector
for brainstorming of project ideas.
- Evernote: Powerful note-taking tool. Fast. Group sharing is a little bit slow (as of March 2015).
- Face-to-face discussion: Super effective. Courtesy should exist even among intimate friends.
Prior appointment and punctuality essential for persons who are busy or whom you meet for
the first time.
- Telephone: Fast. Stable. A double-edged sword that is convenient for the caller but takes away
the time of the other once answered. Do not hesitate to use in emergency.
Conclusion
At BIOMOD, your project will be evaluated by other people. In order to get as good an evaluation
as possible, you should have as many people as possible comment on the project beforehand.
You must decide which comments to listen to and which to discard. No questions asked at the
competition should be new to you. Always prepare for the discussion as a team.
— Shinichiro M. Nomura (Tohoku University)
Importance of Website
A good Website is key to success, because it accounts for up to 50 points of a perfect total score
of 100 points. In order to win the competition, it is important to make the team Website clear
and interesting.
Judging criteria for Website
- Project Idea (worth up to 20 points): Relevance, feasibility, etc.
- Project Documentation (worth up to 20 points): Layout, clarity, etc.
- Project Execution (worth up to 10 points): Did the team accomplish what they set out to do?
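To see how the three criteria add up to the Website's share of the total, here is a minimal sketch. The caps and the 100-point total are taken from the text above; the function name and the example scores are hypothetical.

```python
# Caps for the three Website criteria, as listed above.
WEBSITE_CAPS = {"idea": 20, "documentation": 20, "execution": 10}
TOTAL_POINTS = 100  # perfect total score over all categories, per the text


def website_score(points: dict) -> float:
    """Sum the three Website criteria after checking each against its cap."""
    for name, cap in WEBSITE_CAPS.items():
        value = points[name]
        if not 0 <= value <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}, got {value}")
    return sum(points[name] for name in WEBSITE_CAPS)


max_website = sum(WEBSITE_CAPS.values())
print(max_website, max_website / TOTAL_POINTS)  # 50 0.5

# Hypothetical scores for one team.
print(website_score({"idea": 16, "documentation": 14, "execution": 7}))  # 37
```

Idea and Documentation together make up 40 of the 50 Website points, so clear writing and a well-justified idea weigh more here than the experimental outcome alone.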
Design of Website
The Website should consist of more than one page for reasons of clarity. A standard construction
is as follows.
1. Top Page: Usually this includes figures and YouTube so that the overview and results of
the project can be understood at a glance. Some teams only describe the project overview
here.
2. Project Overview: Briefly describe the background and goal of the project, using a fig-
ure(s). In particular, clearly describe “the necessity of the use of a DNA nanostructure for
the purpose” to justify the project.
3. Experiments/Calculation: Arrange figures into a clear story, not just a list of the results. If
you have too many figures to show, you can include less important figures in a supple-
mentary information page.
4. Discussion: In most cases, the project will not proceed as expected. Describe the results
and unsolved problems of the project clearly so that it is useful for future BIOMODers and
researchers interested in the BIOMOD pages.
5. Protocols: Like paper writing, the project will be forgotten if others cannot reproduce your
results based on the Website. Describe your protocols in as clear and detailed a manner
as possible. The underclass students will enjoy learning from it in the next year.
6. About the Team: Introduce your team members. Also include acknowledgements and
links to the sponsors who support multiple aspects of BIOMOD.
Technically, fixing the format (size, font, etc.) of the figures to be uploaded to the Website and then
filing them will facilitate Website writing. Figure titles, legends, and other notes should
be attached to the figures, so that the person who processes/uploads the figures will not have to
ask for the information later. Another hurdle in Website writing is English. In order to make
a good Website in English, read papers to learn technical terms/phrases, and let your draft be
reviewed by the mentor, TA, etc.
— Hisashi Tadakuma (Kyoto University)
Story
However beautiful your graphics are, the most important thing in your video is the story. A good
video must be based on a story that clearly describes the project and at the same time has an
impact.
This piece is dependent on how well you brainstormed project ideas to decide the subject
matter. Always keep in mind whether a given idea will turn into an interesting video (or how
the idea can turn into a story with an impact).
Planning
In our experience, many teams panic and only start making the video when the last month comes.
You should avoid this if possible. I recommend that each team have a director. The director
should watch and study previous BIOMOD videos, and practice camerawork, video editing, and
the like by herself/himself. If the director is also an audio engineer and CG engineer, she/he
should practice these tasks, too. Filming a documentary of a team meeting and editing it will be
an effective practice. After the domestic trial (or alternatively some modification of the project),
you will have picked the specific subject matter of the project. Ideally, the director should have
acquired enough know-how to start shooting the video by this time.
Before shooting, you should check the schedule. After you begin shooting, you should collect
80% of the required shots in the first 20% of the remaining days. In the last lap, you will often be
occupied with the experiment or Website writing. Finishing predetermined parts of the video as
soon as possible will help out later. The entire process can be viewed as follows.
Video Timeline
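The rule of thumb of collecting 80% of the shots in the first 20% of the remaining days can be turned into a concrete date on the team calendar. This is a minimal sketch; the function name and the example dates are hypothetical, and rounding down to whole days is an assumption.

```python
from datetime import date, timedelta


def shot_milestone(start: date, deadline: date, time_percent: int = 20) -> date:
    """Date by which most shots (80% in the text) should be collected.

    start is the first day of shooting, deadline the video due date.
    Integer arithmetic keeps the result exact; partial days round down.
    """
    if deadline <= start:
        raise ValueError("deadline must be after the start of shooting")
    remaining_days = (deadline - start).days
    return start + timedelta(days=remaining_days * time_percent // 100)


# Hypothetical dates: 40 days remain, so the milestone falls 8 days in.
print(shot_milestone(date(2016, 9, 10), date(2016, 10, 20)))  # 2016-09-18
```

Writing this date on the shared calendar makes it visible to everyone that the bulk of shooting has to happen almost immediately after the storyboard is done.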
Scenario
To start with, decide exactly what kind of flavor and story the video should have. The scenario
is important: work it out fully in advance so that the entire process of video
production goes smoothly. Needless to say, the scenario also affects the description of your
project in Website and Presentation categories, so that it should consist of a story acceptable
to all members and a clear description. You can run a scenario competition to choose one of
those written by the team members. Alternatively, you can borrow a standard genre format as a
sample form to write your scenario, though it is not good for your originality. In order to engage
the audience within just 3 minutes, you can adapt styles of well-known video games, movies,
and TV dramas to introduce your story. (Anyway, enjoy this process.)
At this point, the director should, if necessary, correct the scenario in terms of the impact of
the story, clarity of the description, and even the feasibility of the story, based on the know-
how obtained so far. If the description part cannot include many CG shots, look for alternative
means, such as reduction of the shots. You can also write individual lines for specific actors
(“Ategaki” in Japanese), so that these amateurs can act easily. Location shooting outside the
University will usually take too much time. You can write the scenario so that it will take as
little time as possible. A green/blue screen is convenient, but note that it takes much time in
the postproduction step.
Once the casting has been set, gather the actors in one place to rehearse the voice recording at
least one time. In particular, if the actors are to talk in English without any subtitles, every actor
has to do a thorough rehearsal in advance. Here, the director should watch the actors speaking
carefully. If anyone cannot manage spoken English at all, all the lines can be displayed
as English subtitles. In this case, you need English subtitles translated from the original script
written in Japanese. (Each subtitle should be of a length that can deliver the meaning of the
subtitle at a glance.)
Storyboard
Once the scenario has been written, the entire story should be somewhat clear. The director has
one more step to prepare for the shooting: the storyboard. This places individual shots into the
scenario (storyboarding), based on which specific camera techniques, sound effects (SE), and background music (BGM) are assigned
to the shots. Once the shots are determined, get a stopwatch and count the individual shot
lengths, with their lines and the acting imagined in your head, to adjust these and other factors
including staging. If any subtitles are to be used, balance the lengths with those of the shots and
lines at this step. (Actually, some directors skip the latter steps and start shooting at once. Since
the total length of the video and the deadline are predetermined and tight, these steps will help
you imagine the actual shots and avoid retakes at the filming site, especially if you are new to
filming.) At this point, you should be able to imagine the entire video in your head and to share
the images with the actors and film crews. If there are any props required, manufacture them
during this step. If you have a CG engineer and/or a sound engineer, let them start their job at
this point. Confirm that the lengths of the shots do not exceed the total film length and that the
shots balance with each other in length.
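The final check on shot lengths can be done with a stopwatch and a few lines of code. The sketch below assumes the 3-minute video length mentioned earlier as the limit; the `max_share` threshold used to flag an unbalanced shot is an arbitrary assumption, and the shot names and timings are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Shot:
    name: str
    seconds: float  # length measured with a stopwatch


def check_storyboard(shots: List[Shot], limit_seconds: float = 180.0,
                     max_share: float = 0.5) -> Tuple[float, float, List[str]]:
    """Return (total length, seconds over the limit, names of dominant shots)."""
    total = sum(shot.seconds for shot in shots)
    over_by = max(0.0, total - limit_seconds)
    # A shot taking more than max_share of the video is flagged as unbalanced.
    dominant = [shot.name for shot in shots
                if total > 0 and shot.seconds / total > max_share]
    return total, over_by, dominant


# Hypothetical storyboard.
shots = [Shot("intro", 20), Shot("mechanism CG", 100), Shot("results", 50)]
print(check_storyboard(shots))  # (170, 0.0, ['mechanism CG'])
```

Here the video fits within the limit, but the CG shot takes more than half of it, which is a hint to shorten it or split it into several shots.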
Filming
A rule for filming is to do it at one sitting, to film all required shots in as short a period of time
as possible. First of all, decide the order in which the shots in the storyboard will be filmed.
Because you already have details of each shot in the storyboard, you don’t need to start from
the first page of the storyboard. Basically, you will film the shots as you have imagined them from
the storyboard. Of course, you can adopt a good shot that happens accidentally. Easy shots that
can be filmed independently of others, such as landscape shots, may be assigned to different
members.
Conclusion
This section has summarized the basic process for making a BIOMOD video, starting from the
importance of the video and along the actual video production process. Of course you don’t
necessarily have to go through all of these process steps, but with such a process always in your
mind you can make a better video. The judges and the audience are humans, just like you.
Thus, your video should be amusing and interesting to yourselves too.
— Shogo Hamada (Cornell University)
Staging in Presentation
Presenting at BIOMOD is very different from presenting at scientific meetings in general. At
scientific meetings, you will only have to describe the content of your study and its value as
clearly as possible. BIOMOD, as a student competition, allows you to present more freely in an
original way, and originality is also evaluated. In fact, the judging criteria include how impressive
and interesting your presentation is (see section 3). Therefore, staging of your presentation is
important.
Staging Techniques
- Dramatization: Incorporation of some sort of fictional drama.
- Dialogue: Cross-talk of more than one member.
- Cosplay: Use of a puppet, Kigurumi (character costume), or the like (in the middle of the
presentation).
- Visual effects: Use of a video projector.
although it is not included in the list of judging criteria. Non-native English speakers like
Japanese students are greatly handicapped. You might not hear the question well, and, if you
were able to understand it, you might not know how to answer. The team members on the stage
look at each other’s faces, standing still in an awkward silence… which is the worst situation of
all. Below is the list of what you can do in advance to avoid this. At the domestic trial held around
September, you are also required to do the presentation and discussion in English, where you
can get a hint of the atmosphere at the final.
On the way to BIOMOD, I acquired skills too. BIOMOD forced me to acquire basic
skills in laboratory experiments, PC skills to handle Website and use Illustrator, and
other skills. For me, however, the greatest thing about it was that it taught me the
importance of English.
At the final, I was shocked that my English skills were so awful compared to the
presenters of foreign teams and to other Japanese teams. BIOMOD was a miniature
of the world to me, and taught me that English was essential to work in an interna-
tional field. I want to improve my English skills not only in terms of writing/reading
but also communication, to be good at using English as a tool.
In April of sophomore year, I felt somewhat unsatisfied with my university life. I had
been thinking that a university student should actively study what she/he wanted at
her/his own will, but I was always occupied with tasks given by others. That was why
I was moved by the passion of the upper class students for BIOMOD and decided to
join the team.
In the brainstorming of project ideas, I often spent a day in the library looking for
something useful in the literature. After the project was set up, often the experiment
didn’t work well for many days and we had to discuss the possible cause for
several hours. At BIOMOD Japan, we had to make a presentation using slides and
a script that had been just prepared that morning. During the presentation I felt so
desperate and nervous that my hands started shaking and I was not able to smile
naturally. When we won Grand Prize, I was so happy that I could not stop crying.
The days before the final passed quickly. The venue of BIOMOD, Harvard University,
had a large, relaxing campus surrounded by greenery. This allowed every one of
us to carry out her/his role in the presentation, which was based on the motto of
the team, “Involvement of all members,” without any pressure. I don’t know if it
was because we kept on improving both the website and the YouTube video until just before the
deadline, but we won more than one award, which I’m really proud of. The lobster
served on the night of the Awards Ceremony tasted really good.
I owe all the results and pleasure to the advice and cooperation of the mentors.
At the same time, I’m very grateful to the support from my family and friends, and
hard work of the team members. In order to make it better and better, I spent a very
enjoyable, fruitful 6 months discussing, colliding, studying, and laughing a lot with
the team members.
Hard work with the team members toward this one goal made me very happy,
making it a lifetime memory. To devote one summer to BIOMOD will certainly
reward you in the end.
amateur, one came to make a high-level video, another made a molecular dynamics
program, etc. None of these students seemed to have such talents when we first
met, and even a few months after the project started. Probably they were not
aware of it either. They happened to participate in BIOMOD, were forced to devote
themselves to it, and improved to a great extent.
Some of the students were not able to contribute much to the BIOMOD project, but
were kind of awakened after BIOMOD. For example, one of the students improved in
English skills from around 700 points of TOEIC to a nearly perfect score in two years,
and even mastered French by the junior grade. More than one student obtained
experimental results enough to write a paper in one year. I have worked in more
than 5 laboratories, but I have never encountered such a group of students who
had so many hidden talents or were awakened at such a high rate.
You will get not only skills/abilities/ambitions, but also a friendship with other mem-
bers through the hard work during Summer. BIOMOD is tough. Before Summer, you
might not find a good project. During Summer, the experiment might not proceed
as you have expected. Your Website and YouTube might still be unfinished when
Autumn comes. Criticisms from the mentors will often make you feel down. You
are sure to feel nervous before the deadline and the presentation, and might be
frustrated with a member who is not cooperative. The leader might sometimes
request too much from you. After you have got over all these troubles, however,
the team members will become comrades in arms.
Such skills, awakening, and friendships won’t come for free. You may get these
things after you have really devoted yourself to the project, working late, forgetting
about lunch until you have done everything you can. At a meeting after a crushing
defeat at the domestic trial, one of the students, who had always seemed to be a
happy-go-lucky one, groaned out: “I want to win. We have to change ourselves.”
Through these moments of passionate determination, the students really start
improving through BIOMOD.
Brainstorming
Once these priorities have been agreed on, you can start to narrow down the specifics of the
project. It can be any project that relates to the designing of biomolecules. You may not exactly
know what it is like to design a biomolecule, so you should consult this book, YouTube videos of
previous participants, and discuss the question with upper-class students and teachers. Then
you need to discuss your possible ideas with other team members to generate more ideas and
refine the ideas that come up. This process is brainstorming. There are different types of brainstorming (see
section: Types of Brainstorming).
Chapter 2 Before Starting Your Project 36
You should be able to summarize these three points clearly and simply. For example, a good
idea might be summarized as:
However, the last one might be difficult for you to judge. Again, you can consult the mentors.
Copyright
Your YouTube video and project Wiki for BIOMOD will be made public online. You must make
sure that the figures, videos, and sounds included are either original or in the public domain.
Any publication that you consulted should be cited with appropriate information (authors,
paper title, journal, year, and so on). Respect for the work of others is very important in the
academic world too.
Brainstorming
Brainstorming is a technique to find new ideas and perspectives through divergent thinking.
Several members gather to have a free, extensive discussion on a given subject matter.
It is the most orthodox process for generating creative ideas, and can be conducted in a very
straightforward manner.
First, decide together on a concrete topic. For example, the subject matter “how to make every
worker wear a hard hat” is better than “how to reduce construction site accidents”. Then,
individual members write ideas relating to the subject matter on a large sheet of paper or
whiteboard, and all discuss them one by one. The number of members and the length of the
whole process are not particularly limited, but several to 10 members and about 1 hour work
well. If you want to go on for longer, the group should take a break about every hour to refresh
itself.
In brainstorming, you can present essentially any idea but you should note the following four
points:
1. Welcome wild ideas: Free yourself from common sense and good sense to present any
idea. Never mind if it is strange or unusual.
2. Withhold criticism: Evaluation of proposed ideas should be put on hold. Do not criticize
other members’ ideas or your own, however ridiculous or uninteresting they seem.
3. Go for quantity: Instead of looking for the finest ideas, present as many ideas as possible.
4. Combine and improve ideas: Active utilization of existing ideas is also important. Mod-
ifying and combining different ideas can effectively generate new advanced ideas. As
described in (2), do not criticize ideas of others unless it improves the ideas and as a
consequence generates new ideas.
condition such as “they are to be transformed”, “they are for seniors”, etc.
- Example of analogy: Choose “novel scissors” as the subject matter. Think “The role
of scissors is to cut” -> “Cutting is the purpose of guillotines, too” -> “Let’s make novel
scissors based on guillotines”.
3. Different perspectives: Think about different “products”, “properties”, “functions”, “com-
binations”, and so on, from “bird’s eye view/bug’s eye view” or the like [2].
Brainwriting
“Brainwriting” is similar to brainstorming [3]. It is effective in generating many ideas in a short
time, generating ideas in a group of rather reticent people, etc. Below is an example of the
process, applied to six people.
7. Every time you receive a worksheet, add new ideas inspired by the ideas in it (you can develop
ideas of the previous participant or propose totally different ideas inspired by them).
Repeat the steps 6 times in total. This takes 30 minutes or less, and you will get as many as 108
ideas (= 6 members x 3 ideas x 6 rounds).
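The count above can be checked with a small simulation. This is only a sketch under assumptions: the member names, the function name brainwriting, and the rule that each member writes 3 ideas per round and then passes the worksheet to the next member are illustrative, since the earlier steps of the procedure are not listed here.

```python
from collections import deque

def brainwriting(members, ideas_per_round=3, rounds=None):
    """Simulate worksheets being passed around a circle of members.

    Assumed procedure (hypothetical): every member writes `ideas_per_round`
    ideas on the worksheet in hand, then passes it to the next member.
    """
    rounds = rounds if rounds is not None else len(members)
    # One worksheet per member; each entry records (author, round, idea number).
    sheets = deque([] for _ in members)
    for rnd in range(1, rounds + 1):
        for holder, sheet in zip(members, sheets):
            for k in range(1, ideas_per_round + 1):
                sheet.append((holder, rnd, k))
        # Pass every worksheet on to the next member.
        sheets.rotate(1)
    return list(sheets)

members = ["A", "B", "C", "D", "E", "F"]
sheets = brainwriting(members)
total_ideas = sum(len(sheet) for sheet in sheets)
print(total_ideas)  # 6 members x 3 ideas x 6 rounds
```

With six members and six rounds, every worksheet passes through every member's hands once, which is why the total scales as members x ideas x rounds.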
Mind Map
In this technique, a list of ideas inspired by the subject matter is generated by drawing pictures.
KJ Method
The 3 techniques described above are for divergent idea generation. Sometimes you may have
to sort the ideas generated. One technique to sort ideas is the KJ method [4], which can be
summarized as follows.
Fiat Specification
As described in section 12, the logical basis of the project is important. If the logic is clear
enough, the functions and properties of the desired product will be self-evident. Specification is
making a list of the functions and properties the project is aiming for. The design process starts
with specification.
The difficulty of a design will depend on the specification. A specification that includes conflict-
ing requirements (e.g., a car with good speed and high fuel efficiency) is not feasible (i.e. has
no design solution). However, an easily achievable specification, e.g. one that doesn’t need an
elaborate design, may result in an ordinary product.
Design process
At each step, check the following points:
1. Specification: Is it at the right level of difficulty, neither too easy nor too hard?
2. Choice of parts: Are they readily available, e.g. mass produced (see section 73)? Avoid too
many parts (different molecules). Cost should be appropriate (see section 73).
3. Design: Avoid any non-practical assumption or wishful thinking.
4. Experiment: Is a simulation model possible (see section 62)? Is it testable by experiment (see
sections 69 and 70)?
5. Evaluation: Is it reproducible? Accurate? (see section 70)
In principle, the design should proceed from the whole to details, from abstractness to concrete-
ness.
1. Conceptual design: Specify the required functions based on the overall image.
2. Preliminary design: Specify the structure and shape to achieve each function.
3. Detailed design: Specify the size and material of each part.
The process of dividing a system into smaller concrete parts is called decomposition, and the
process of combining the resultant parts back is called integration. Successful design requires
going from decomposition to integration and vice versa to specify details of the system. Remem-
ber this important constraint: one function per part. Any part with more than one function will
hinder design changes and error corrections. Such “optimization” should be done, if any, after
the basic design has been established.
- Stamina is the patience to pursue optimization. If you have made a system but are still anxious
about possible hidden problems in it, and if it seems more and more incomplete to you as you
keep working with it, then you surely have a talent for handicraft.
You are going to combine all these abilities to design a molecular system. There is no easy
way out, and remember that you should not make the schedule too tight. Otherwise you can’t
examine the design from various perspectives, and you can’t deal with troubles. Good schedule
management is a prerequisite for a good design (see sections 6 and 7). In most cases, it is no use
making a final spurt just before the deadline in order to solve any technical problems. You will
have nothing to learn from “Project X”-like success stories.
MIRACLES
One more thing is important to designing. Only one MIRACLE at a time. MIRACLE means
an unknown element required to attain the desired function, which is not described in any
textbook or paper and thus has to be newly developed. A project without any MIRACLE tends
to be just a combination of previously reported results, and thus not very exciting. On the
contrary, a project with more than one MIRACLE will not be achievable in 6 months and thus is
unsuitable for BIOMOD.
Literature Search
You will usually use the Internet for a literature search. There are several literature search sites.
One that is accessible from anywhere is Google Scholar [2]. For example, when the keywords
“DNA origami” (see section 18) are entered in the search window, the search results are like the
figure below. You can look at these titles and abstracts to find papers of interest.
Google Scholar
Electronic Journals
In recent years, most papers are also published electronically on the Internet, so that you
don’t have to go to the library to read them. Papers are sometimes open access (free access),
but most charge a fee. However, you can access any paper from the university’s network and
read/download it for free, as long as your university is under contract with the journal. As
mentioned above, a paper is still copyrighted, and thus should not be modified or redistributed
without permission. In addition, massive downloads in a short time might result in university-
wide suspension of service.
appropriate. Then read through the paper once. You don’t need to understand everything in it
on the first read. Read it again, referring to textbooks and references to address any confusion.
It will be fun to discuss the paper with your BIOMOD team members and upper-class students.
Sections of a Paper
Read as many papers as you can. Practice makes perfect. In doing so, you should also record
these papers in reference management software or another tool, which will help you
create the references for the Wiki (see section 8). You can try Mendeley, ReadCube, JabRef,
and various other tools. Please visit the web site for details.
—Fumi Takabatake (Tohoku University)
Introduction
Scientific research consists of a continuous cycle of investigating a phenomenon, summarizing
the results, and disseminating the information to the community. Through this cycle,
researchers enrich the world little by little. This section focuses on papers, which are a very
important part of the process of disseminating research findings.
Many procedures and norms in the scientific community were established before the advent
of modern computing and the Internet. While the core research cycle is likely to persist in
some form, the approaches used to measure, summarize, and disseminate research findings
are gradually evolving in response to technological innovation.
About Papers
Papers report on a specific subject matter, and are published in a scientific journal (figures
and videos included). Because a paper reports a new discovery by the researcher (or team),
it is called a primary source. A textbook summarizes the essence of the results of papers in
a learner-friendly form, and is called a secondary source. There are fee-charging papers and
free ones. A university is a public institute that provides access to primary information, i.e.,
papers, which allows you to do a wide variety of in-depth research. For example, accessing
the latest issue of Nature from home, without logging on through a university, would
cost you 3300 yen per letter. There are peer-reviewed papers and non-peer-reviewed ones. In
addition to journal articles, there are papers published in international conference proceedings
(“international conference papers”). In some fields, papers of this type are peer-reviewed more
strictly than journal articles, and thus more appreciated.
Scientific Journals
A publication that includes more than one paper is called a journal or proceedings. Since 1996,
many journals have gone to electronic publication.
Peer Review
A paper sent from the author or researcher to a scientific journal will not be published as
received. The received paper is first read by the editor, and then passed on to reviewers if it is
of interest and describes sufficiently high-quality research. Usually, more than one reviewer in
the relevant field reviews the paper and returns comments to the editor, and the editor decides
whether to publish it. In most cases, the comments are fed back to the author, and the author
can revise the paper based on the comments. After one or more cycles of this process, the editor
decides to accept or reject the paper. This is called the peer-review system.
Pre-prints
Preprints are a means to rapidly share scientific manuscripts with the scientific community
prior to completion of peer review. Preprints are rapidly gaining adoption. More information is
available at YouTube: What are preprints?¹⁰ and Ten simple rules to consider regarding preprint
submission¹¹.
Types of Papers
The types of original papers for a typical journal include “communication” (about 2 pages),
“letter” (about 4 pages), and “article” (about 5 pages or more). Articles are the most common
type of publication. A review paper summarizes the state of the art and remaining challenges
in a specific field, citing related papers in the field. Remember that authors of reviews are still
humans and not free of bias when they list references.
¹⁰http://bit.ly/2rMUyme
¹¹http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005473
Sections of a Paper
Below are the sections of a typical full paper.
- Title: Important. It sometimes shows the author’s struggle to make it witty.
- Author(s): The first author is usually the one who actually wrote the paper. The corresponding
author(s) assume overall responsibility for the manuscript, including compliance with journal
policies, vouching for data veracity, and handling any questions that arise during and after
publication.
- Affiliation: Tells you where the authors work and do their research.
- Abstract: Very Important. If the content is not what you expected, look for other papers.
- Introduction: Describes the background of the paper. You can skip it if it’s familiar to you.
- Materials & Methods: Important. Describes details of the experimental methods used to
evaluate the hypothesis of the paper.
- Figure(s): Very important. Presents data.
- Results: Describes experimental results. As you get used to reading papers, sometimes you
can understand a paper by only looking at figures.
- Discussion: Important. The results of the paper are discussed. This often occupies a relatively
large portion in a paper written by a native English speaker.
- Conclusion: Usually no tricks here.
- Acknowledgment: Appreciation for those who provided advice, technical support, samples,
and financial support for the research.
- References: Very important. Papers cited here have been chosen because they are worth
discussing; i.e. the number of citations is sometimes more important than the number of
accepted papers.
- Supplementary Information: Important. Because experimental procedures nowadays are
getting more and more complicated, some results and details are too lengthy to be described
in the article body, and thus are provided separately as supplementary information. Sequences
for DNA origami are usually found here. Most videos are found here too. This section may even
include something like an essay on twists and turns of the study.
Value of a Paper
Writing a paper is fun. Publishing it is fun. Having other people cite it is fun. A researcher’s
productivity is measured by the number of publications that person has. As mentioned above, a
paper’s value is measured by how much that paper enriches the world in terms of its originality,
reproducibility, and wide applicability. These criteria can be judged based on the number of
citations, impact factor (a measure of the relative importance of a journal), and so on. The
simplest way to collect information for your project is to investigate a specific subject matter.
On the other hand, because both research and publications are the result of human activity, you
may have fun tracking a favorite author or the author of a favorite paper. Every paper includes
the e-mail address of the corresponding author. If the paper interests you, you can ask for advice
from her/him via e-mail, which should be welcomed.
DNA Molecule
When asked what DNA is, most of you will answer, “genes.” This is a correct answer in a biology
class. However, have you ever thought of DNA simply as a “molecule” with a defined shape?
Chemically, DNA is just a molecule made of a long string (“polymer”) of “nucleotide” units, each
of which consists of a sugar, base, and phosphate. DNA in this form is sometimes called “single-
stranded DNA”.
There are 4 kinds of bases: A (adenine), G (guanine), C (cytosine), and T (thymine). A base
sequence is obtained by reading out bases in a single strand of DNA from one end to the other.
DNA has a directionality, with one of its ends called the 5’ terminus and the other the 3’ terminus
(based on the positions of two phosphates attached to a sugar), wherein its sequence is read
from the 5’ terminus (as seen from the figure, the direction is often indicated by an arrow).
“DNA=genes” because organisms recognize DNA base sequences as information that guides a
variety of biological activities including protein synthesis. Although DNA itself is just a molecule,
organisms use it as a “data storage medium” to record/read out a set of information that matters
to them.
As you know, the structure of DNA is a double helix. Two single strands of DNA that run in
opposite directions stick to each other along their length to create this form, in which the
Chapter 3 What Does Design of DNA Molecules Make Possible? 51
bases also play an essential role. Each base has a property of “complementarity”, so that only
a specific partner can stick to it through hydrogen bonding, i.e. A to T and C to G. Two single
DNA strands with mutually complementary sequences stick (hybridize) to each other along their
length through hydrogen bonding between bases, forming this double helix structure of double
strands. Strictly speaking, hydrophobic interactions between adjacent bases in the same strand
are also involved in the double helix structure formation, but we will not discuss that here.
In terms of engineering, you can expect that, even when a large number of DNA molecules
with different sequences are present in an aqueous solution, any of them will only stick to
a complementary DNA molecule to form a double strand, i.e. the double helix structure. See
sections 41 to 44 for more details.
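The complementarity rule (A to T, C to G) and the antiparallel arrangement of the two strands can be sketched in a few lines of Python. The function names and the example sequence are illustrative assumptions, and the check is deliberately strict: real strands can also hybridize partially or with mismatches, which this sketch ignores.

```python
# Watson-Crick partners: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the fully complementary strand, written from 5' to 3'."""
    # The partner strand runs in the opposite direction, so read the
    # input backwards while swapping each base for its partner.
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def hybridizes(strand_a, strand_b):
    """True only if the two strands are perfectly complementary along their length."""
    return strand_b.upper() == reverse_complement(strand_a)

strand = "GATTACA"                      # hypothetical example, 5' to 3'
partner = reverse_complement(strand)
print(partner)
print(hybridizes(strand, partner))      # a perfectly matched pair
print(hybridizes(strand, strand))       # a strand is not its own partner here
```

Because both sequences are written 5' to 3', the partner has to be reversed as well as complemented. Forgetting the reversal is a common mistake when designing sequences by hand.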
Branched Structure
As an example, in the left Figure, the regions (a) and (c) in the first DNA have, by design, a
sequence complementary to a region of the fourth DNA and to a region in the second DNA,
respectively. Four different DNA molecules thus make this four-arm branched structure because
only mutually complementary regions can hybridize to form a double helix. By dividing each
DNA molecule into multiple regions and designing each region to stick to a specific region in
the same or another DNA molecule, you can assemble not only a straight chain of double helix,
but also more complex two-dimensional structures. At present, the most commonly used DNA
nanostructure components include four-armed branches as described above (one without any
single strand region b is sometimes called a crossover junction) and similarly designed three-
armed branches.
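To make the idea of dividing each strand into regions concrete, here is a minimal sketch that generates four strands for a four-arm branch. It assumes the simplest layout, in which each strand is only two regions long (no single-stranded region in the middle) and the arm sequences are random. The function names, arm length, and seed are hypothetical, and a real design would also check for unwanted complementarity between arms.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def random_region(length, rng):
    return "".join(rng.choice("ACGT") for _ in range(length))

def four_arm_junction(arm_length=8, seed=0):
    """Return four strands (5' to 3') that pair up into a four-arm branch."""
    rng = random.Random(seed)
    arms = [random_region(arm_length, rng) for _ in range(4)]
    strands = []
    for i in range(4):
        # Left region pairs with the previous strand, right region with the next one.
        left = reverse_complement(arms[i])
        right = arms[(i + 1) % 4]
        strands.append(left + right)
    return strands

ARM = 8
strands = four_arm_junction(arm_length=ARM)
for i, s in enumerate(strands, start=1):
    print(f"strand {i}: {s[:ARM]} | {s[ARM:]}")
```

In this layout the first strand shares one arm with the fourth strand and the other with the second, mirroring the arrangement described for the figure.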
Sticky Ends
Another important component of creating DNA nanostructures is sticky ends. Imagine two
double strand DNA molecules (red and blue in the figure), each provided with a single strand
overhang (a or b in the figure) in its terminus. If the base sequences a and b are mutually
complementary, the two double helices are joined together through the hybridization of a and
b. Such single stranded (i.e. a and b) regions are called “sticky ends”. This technique is frequently
used to join units into a large structure by the “glue”-like function that allows selective binding
between their termini.
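The selective "glue" behaviour of sticky ends can be sketched as a lookup of which overhangs are mutually complementary. The unit names and overhang sequences below are made up for illustration, and the check requires a perfect match, which is a simplification of real hybridization.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def can_join(overhang_a, overhang_b):
    """Two sticky ends join when one is the reverse complement of the other."""
    return overhang_b == reverse_complement(overhang_a)

# Hypothetical units, each exposing one single-stranded overhang (5' to 3').
units = {"red": "GCATC", "blue": "GATGC", "green": "TTTTT"}

# List every pair of different units whose overhangs can hybridize.
pairs = [
    (x, y)
    for x in units
    for y in units
    if x < y and can_join(units[x], units[y])
]
print(pairs)
```

Only the red and blue units are reported as a joining pair. The green unit has no partner in this set, so it would stay unattached.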
Three Approaches
How should you apply these concepts, i.e. branches and sticky ends, to assemble a DNA
structure? Currently there are three major approaches to the assembly of structures using DNA.
1. Folding (e.g. DNA origami and tiles)
2. Network formation (e.g. DNA hydrogel)
3. Connector-like function (e.g. SNA)
These approaches can be distinguished by the different ways they direct a DNA molecule. In
the folding approach, DNA double helices are bundled to form a desired nanostructure. Each
double helix is regarded as a “tube”, and, in principle, the structure is designed to form by
joining the tubes. Techniques that rely primarily on folding include DNA origami, DNA tiles
composed of a variety of DNA motifs, and recently reported DNA bricks. DNA hydrogel is a
relatively new assembly approach, which considers a long double helix made of a plurality
of joined DNA molecules as a “flexible macromolecule structure”, not as a rigid tube. DNA
hydrogel combines such structures to form a network, or gel, on the macro scale. In SNA, which
is described in section 21, the selective binding ability intrinsic to DNA is more important than
its structure, and DNA is used as “a glue with a length and a selective action”. In this approach,
DNA molecules are attached to gold nanoparticles to control the interaction between them and
thus the crystallization of the nanoparticles.
These approaches will be discussed in detail in the following sections.
—Shogo Hamada (Cornell University)
helix and 8 bases complementary to the third helix are added to the aforementioned 16 bases,
resulting in 32 bases total. As a result, in a DNA origami structure, the staple chains are bent
in an S shape and/or a Z shape to bind to the scaffold chain. This might seem unimportant, but
every helix has its handedness. Therefore, it is important to note that whether a staple chain is
arranged in an S shape or a Z shape can make a large difference. That is, when a staple chain
is arranged in an S shape, the termini of the staple chain will be the rearmost objects in the
design. When it is arranged in a Z shape, they will be the foremost objects in the design. This
should be taken into consideration when the staple chain is to have an additional DNA chain
or the like hanging from it. In order to fold the 7,249-base scaffold chain by means of 32-base
staple chains, 7249/32 ≈ 226.5, so about 227 staple chains are required in the simplest calculation.
Because the M13mp18 genome sequence can be regarded basically as random, well over two
hundred DNA chains with mutually different sequences are required. Preparing their
mixture might feel like a hard job, but the beauty of the resulting structure will reward you for
all your trouble.
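The staple estimate is a simple division, shown here as a short calculation. The constants come from the text. Treating every staple as exactly 32 bases and covering the whole scaffold are simplifying assumptions, since actual designs often leave part of the scaffold unpaired and vary staple lengths.

```python
import math

SCAFFOLD_LENGTH = 7249  # bases in the scaffold chain, as stated in the text
STAPLE_LENGTH = 32      # bases per staple chain in this simple design

exact = SCAFFOLD_LENGTH / STAPLE_LENGTH
full_staples, leftover = divmod(SCAFFOLD_LENGTH, STAPLE_LENGTH)
needed = math.ceil(exact)

print(f"{exact:.2f} staples in the simplest calculation")
print(f"{full_staples} full-length staples, {leftover} scaffold bases left over")
print(f"{needed} staples if the leftover bases get their own shorter staple")
```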
Design and AFM image of Smiley DNA Origami Structure. Reprinted with permission from
Nature, 2006.
In BIOMOD 2013, Team Kansai from Kansai University reported a “DNA Origami flipbook
cartoon”, composed of a total of 10 AFM images, of “the Rabbit and the Turtle”, and won the
gold prize. Their idea of minimizing the number of the staples by connecting the ground, turtle,
and rabbit through linkers was admired. The team also assembled a background layer of a
roadside tree passing by via DNA origami, and got to the final in the Intercollegiate Science
Competition sponsored by MEXT.
Exemplary Hollow Three-Dimensional DNA Origami Structures. (a) Box-shaped DNA origami
structure [1]. Reprinted with permission from Nature 2009. (b) Regular tetrahedron DNA origami
structure [3]. Reprinted with permission from Nano Lett. 2009. (c) DNA origami structure from
our group [2].
Once the box-shaped structures as described above are opened, there is no way to keep them
in the open state. Therefore, their free motion in an aqueous solution provides a driving force
to open these structures. Firrao et al. reported an exemplary DNA origami structure that can
be opened/closed by a more active driving force. They created a tubular DNA origami structure
with a similar hatch that is driven by double strand DNA formation, and showed that it actually
worked.
Hollow three-dimensional DNA origami structures can include curved surfaces, by virtue of the
ability to bend DNA strands in a DNA origami [8]. For example, Yan et al. have successfully
assembled flask- and sphere-shaped DNA origami structures.
double helix”). During the cooling, each set of DNA molecules in the solution forms a tile, after
which the tiles self-assemble to form a two/three-dimensional structure (i.e., in the case of DNA,
puzzle pieces themselves are also spontaneously formed in the solution). Motifs that make up
DNA tiles and an “algorithmic self-assembly” technique to form DNA tile-based patterns are
outlined below.
DNA Motifs
A “repeat unit” structure made of a combination of branched DNA structures is called a DNA
motif. The most representative motif used for DNA tiles is the double crossover (DX) molecule
structure. In an aqueous solution that contains magnesium ions or the like to suppress the
electrostatic repulsion between DNA molecules, a four-arm branched (crossover) structure can
have a closed conformation as shown in the left or right in panel a. However, in this state,
the conformation cannot be unequivocally specified because the branch still can move freely
like a hinge. In panel b, two branches are provided between two double helices to fix their
conformation, which is called a double crossover.
Reprinted with permission from Angew. Chem. Int. Ed. ©2009. Reprinted with permission from
[7].
Algorithmic Self-Assembly
When DNA tiles have only one kind of sticky end, you will get a uniform two-dimensional planar
structure made of identical tiles. What if more than one kind of sticky end is present: e.g., two
sticky end pairs 1 and 0? When the right side of a tile A can be joined to the left side of a tile B
through 1 and the right side of B to the left side of A through 0, a structure composed of alternate
repetitions of A and B will form (panel a).
Reprinted with permission from Natural Computing ©2008. Reprinted with permission from [8].
When the DNA tiles self-assemble on the initial structure (b), which includes only one site 1, a
complex pattern made of triangles (c) will form [9]. This is called a Sierpinski triangle. Likewise,
various different patterns are possible by adopting different “rules” of tiles. This process is the
same as a one-dimensional cellular automaton. The lower sides of each tile correspond to an
input, and the upper sides to an output. The right and left lower sides are designed to define
the output from the right and left upper sides (the rules in this figure correspond to the XOR
operation). The self-assembly of such DNA tiles, which can perform computation based on sticky
ends and output various patterns, is called “algorithmic self-assembly”.
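The XOR rule can be tried out in software before thinking about tiles. The sketch below treats each row of tiles as one generation of a one-dimensional cellular automaton: a cell's value is the XOR of the two values below it. The grid width, the position of the single 1 in the first row, and the function name are assumptions made for illustration, not details taken from the figure.

```python
def xor_rows(width, n_rows):
    """Grow rows in which each cell is the XOR of its two lower neighbours."""
    first = [0] * width
    first[0] = 1  # assumed seed row: a single site carrying 1
    rows = [first]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # The missing neighbour to the left of the edge cell is treated as 0,
        # so the edge cell simply copies the value below it.
        new = [prev[0]] + [prev[i - 1] ^ prev[i] for i in range(1, width)]
        rows.append(new)
    return rows

rows = xor_rows(width=8, n_rows=8)
# Print the last row first so the pattern grows upward, like the tiles.
for row in reversed(rows):
    print("".join("#" if cell else "." for cell in row))
```

Increasing width and n_rows to a larger power of two, such as 32, makes the nested triangles of the Sierpinski pattern easier to see.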
Further, it is possible to have every connection differ in sticky-end sequence from all the rest,
so that the positional relationship between every two tiles (thus the final structure) is unequivocally
defined. This can be implemented with single-stranded tiles (SST) to make any two- or
three-dimensional structure, which is called a “DNA brick” (see column 1).
Two-dimensional DNA bricks and reported examples. Reprinted with permission from Nature
©2012. [15]
Crystal structure made of DNA bricks. Reprinted with permission from Nature ©2014. [17].
In addition to DNA tiles, origami, and bricks, which are two/three-dimensional structures made
of double helices, it is possible to obtain a DNA-based “closed structure”.
(a) A cubic structure [10] can be made by ligating a plurality of circular DNA molecules of
catenane topology. (b) An octahedron structure [11] can be obtained by folding a long single
strand DNA made of chemically synthesized multiple DNA molecules, in the presence of other
short DNA strands. (c) A tetrahedron structure [12] can be formed from four single DNA strands
each of which constitutes three of the tetrahedron edges and is complementary to one of
the other strands along each edge. (d) A spherical structure can also be made using the T-
branch motif as described above [13]. By carefully choosing the concentration for annealing,
tetrahedron, dodecahedron, and buckyball-like structures were obtained.
(a) to (d) drawn based on [14], [11], [12], and [13], respectively.
case of an SNA, the spherical (or geometrically anisotropic) particle determines the stiffness.
However, in an SNA, DNA molecules with sticky ends are attached to the entire particle surface.
By virtue of interactions between this DNA population, the concept of “valence” or “bond” at
the atomic level can be directly applied to the reactions between the colloidal particles. As
a result of using the highly designable hybridization between DNA molecules in inter-particle
interactions, programmable colloidal particles with “atom-like behavior” can be designed [3],
which will allow for control of the crystal structure itself.
exemplary gel structures, methods to characterize individual DNA hydrogel products, and some
applications.
Introduction
Among DNA-based structure assembly techniques, DNA hydrogels aim to make macro-scale
structures (i.e. on the order of micrometers to centimeters) by forming a network in solution.
This section outlines previously reported DNA hydrogels and their manufacture, characterization,
and applications.
Chemical and Physical Gels
In general, a gel is classified into one of two kinds depending on the crosslink mechanism of its
network.
One class is a chemical gel. A chemical gel is formed by covalent crosslinking. As a result, as
long as its component molecules are unbroken, the whole network will remain intact (will not
Chapter 3 What Does Design of DNA Molecules Make Possible? 69
melt). A well-known example of chemical gel solely made of DNA has a motif-based design as
described above [1]. Specifically, sticky ends of X, Y, T, or other branched structures are joined
by an enzymatic reaction using ligase. This allows a robust network of double helices to be
produced.
Another class is the physical gel. A physical gel is formed by a non-covalent crosslink mechanism
(e.g., hydrogen bond or physical entanglement). The earliest examples of physical gel prepared
from chemically synthesized DNA include one designed by Liu et al. in a motif-based manner
[2], which has been formed by connecting Y branches using a double strand linker, both
ends of which are sticky ends. In another example, a gel has been obtained using a non-DNA
macromolecule as the network backbone in combination with the specific connection property
of DNA hybridization [3]. Recently, 3D printing was applied to such a physical DNA hydrogel to
obtain desired two/three-dimensional shapes at the macro scale [4].
Another technique to produce a DNA physical gel depends on DNA polymerase enzyme [5] (DNA
polymerase extends a DNA strand complementary to a template, with a primer being the starting
point), which is summarized below. First, a circular template DNA is prepared, to which a primer
DNA (primer 1) complementary to a portion of the template sequence is hybridized, and then
DNA polymerase starts the extension reaction of primer 1. This polymerase has strong strand
displacement activity, resulting in a long single strand DNA product that is composed of multiple
repeats of the sequence complementary to the template. This process is called RCA (Rolling
Circle Amplification). This first reaction is allowed to proceed for a sufficiently long time, and
then primer 2 (which is complementary to a portion of the product of primer 1) and primer
3 (which has a sequence identical to primer 1) are added to the solution. Primer 2 starts an
extension reaction using the extension product of primer 1 as a template. Primer 3 hybridizes to
the initial circular template DNA or the product of primer 2 to start another extension reaction.
The resulting DNA molecules become highly entangled with each other forming a gel network.
Characterization
Is the structure obtained as above truly a gel? In most cases, the answer to this question is
determined by dynamic viscoelasticity analysis using a rheometer. In dynamic viscoelasticity
analysis, a sinusoidal stress is applied to a given sample and the strain response is measured.
Based on their phase difference and amplitude, the storage modulus G’ (the elastic portion; the
degree of solid-like behavior of the sample) and the loss modulus G” (the viscous portion; the
degree of liquid-like behavior of the sample) are calculated. G’ > G” means that the sample is a
gel.
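As a rough illustration of the calculation described above, the following Python sketch derives G’ and G” from the stress amplitude, the strain amplitude, and the phase lag between them, using the standard relations G’ = (σ0/γ0) cos δ and G” = (σ0/γ0) sin δ. The function names and the sample numbers are assumptions made for illustration; they are not taken from any measurement in this book.

```python
import math

def moduli(stress_amplitude, strain_amplitude, phase_lag_rad):
    """Return (G_storage, G_loss) for a sinusoidal oscillatory test.

    Uses the standard relations G' = (s0/g0)*cos(delta) and
    G'' = (s0/g0)*sin(delta), where delta is the phase lag between
    the applied stress and the measured strain.
    """
    ratio = stress_amplitude / strain_amplitude
    return ratio * math.cos(phase_lag_rad), ratio * math.sin(phase_lag_rad)

def is_gel(g_storage, g_loss):
    """Apply the criterion from the text: G' > G'' means gel-like."""
    return g_storage > g_loss

# Hypothetical sample: 10 Pa stress amplitude, 5% strain, 20 degree lag.
g1, g2 = moduli(10.0, 0.05, math.radians(20))
print(round(g1, 1), round(g2, 1), is_gel(g1, g2))
```

A phase lag below 45 degrees gives G’ > G” (solid-like), while a lag above 45 degrees gives G’ < G” (liquid-like).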
Applications
One of the most promising applications of DNA hydrogel is the incorporation of a gene-containing
DNA such as a plasmid into a gel network to promote cell-free protein synthesis inside the gel [6].
This system has proved to be capable of synthesizing protein about 300 times more efficiently
than a solution system. This might be due to the protection of the molecular structure of the
gene by the DNA hydrogel, a higher concentration of the gene in the gel compared to a solu-
tion system, more efficient turnover of the enzymatic reactions (thus improved transcription
efficiency), etc.
—Shogo Hamada (Cornell University)
Molecular Machines
A molecular machine is defined as “a molecule that goes through some directional movement
in response to an input stimulus”. In general, it has multiple states and repeatedly transitions
between them.
An organism is made of innumerable, highly sophisticated molecular machines. A wide variety
of protein-based molecular machines are at work in every cell. For example, molecular motors
essential for bacterial flagellum rotation are driven by the proton concentration gradient across
the cell membrane. Muscle contraction is the result of the sliding of actin and myosin protein
filaments, which depends on the energy of ATP hydrolysis. As another example, F-type ATP
synthase, which is responsible for ATP synthesis in most organisms, itself rotates to carry out
the synthesis reaction. These natural biological molecular machines are highly efficient.
On the other hand, artificial molecular machines employ a catenane (multiple ring-shaped
molecules mutually interlocked like a chain), a rotaxane (a dumbbell-shaped molecule threaded
through a macrocycle, interlocked by the ends of the dumbbell), and the like that function as
joints, bearings, and other movable molecular elements. These machines include molecular
shuttles, molecular tweezers, and nanocars. A nanocar has four wheels made of fullerene, and
each axle can endlessly rotate on a covalent carbon-carbon bond. A nanocar has been reported
to be driven straight over a gold surface by thermal fluctuation [1].
DNA Nanomachines
DNA allows more complex motions to be achieved, and even for these motions to be controlled.
Driving mechanisms for DNA nanomachines include the following.
- Driving Mechanisms for DNA Nanomachine:
  - External
    - Strand displacement (DNA input)*
    - B-Z transition (salt concentration change)
    - Light-responsive base (light input)
  - Autonomous
    - Enzymatic reaction (DNAzyme, restriction enzyme)*
    - Strand displacement (with autonomous DNA input)*

*sequence-specific (only driven by a particular sequence).
In an externally driven system, a nanomachine moves only one step every time it receives a
certain input signal provided externally. In an autonomous system, movement happens au-
tonomously without such external signals (or it appropriately generates the required molecular
signal).
The most common external driving mechanism is strand displacement (see section 42). In
this case, different DNA single strands are added to the solution by pipetting, one at a time.
Each DNA single strand added displaces a part of the DNA nanomachine structure, causing the
nanomachine to change its shape (this constitutes one motion). A representative example is
molecular tweezers (see section 24). One advantage of the strand displacement-based driving
method is that you can choose the site of displacement. Other variations include walker (see
section 25) and pliers (see section 26). At present, the most complex nanomachine driven by
strand displacement is the nanofactory developed by Seeman’s group (see section 26).
The B-Z transition driving method relies on the fact that a DNA molecule with the base sequence
CGCGCG… flips from the right-handed B form to the left-handed Z form and vice versa, depend-
ing on surrounding ion concentration. You can change the ion concentration of the solution by
repeatedly pipetting so that the machine repeats the flipping motion. Another nanomachine
depends on a light-responsive artificial base such as azobenzene (see section 48). Because the
azobenzene molecule changes its form upon light irradiation, UV irradiation can control the
association/dissociation of an azobenzene-containing double stranded DNA. The B-Z transition
and light-responsive base driving methods are not suitable for complex motions, because a
single input causes all actuators to move at the same time.
There are relatively few examples of autonomous molecular machines. One example is the
autonomous walker (see section 25). The walker achieves a walking motion by binding to an
array of track DNA molecules provided on a two-dimensional DNA origami (see section 18), and
then cleaving the bound track DNA by means of a DNAzyme sequence contained in each leg (see
section 30). The walker does not require any external input because it autonomously repeats the
cycle of taking one step and binding the next track DNA upon the cleavage of one track DNA.
Slime Robot
Each of the molecular machines described above works as a single molecule in solution, by
changing its shape and/or binding state stepwise. As a more elaborate system, the Molecular
Robotics Project (see column 1) seeks to achieve a slime-like motion by enclosing a DNA
computing circuit and a molecular actuator in a reactor such as a liposome (see section 89).
The behavior of microtubules can be controlled by attaching a DNA tag to them, to which an
autonomous DNA reaction system (see section 27) will be applied in the future [2].
Slime Robots [2]. Reprinted from Accounts of Chemical Research. ©2014 ACS.
—Satoshi Murata (Tohoku University)
double strand state (see section 40). This self-assembling property can produce two/three-
dimensional nanostructures (see section 17). If such a DNA nanostructure can change its
structure repetitively, it means that a nano-sized machine can be achieved.
For example, a change in the pH and ionic strength of the environment can cause structural
change in a molecule. However, it will not be so easy to obtain a set of different molecules that
will exhibit different structure changes in response to a given environmental change and com-
bine them to perform a complex mechanical task. It is also difficult to have one environmental
change control only one of the different molecular machines or drive all of them cooperatively.
Is it possible to reversibly change a structure assembled by means of the sequence-specific
hybridization of a DNA molecule?
Thermodynamically, the self-assembly of DNA into a nanostructure proceeds toward a more
favorable state in terms of free energy. By appropriately arranging complementary sequences
in the design, a variety of structures can form spontaneously upon just mixing DNA molecules.
Theoretically, after a structure has been obtained, another DNA strand capable of binding to a
DNA strand in the structure in an energetically more favorable state can be externally added to
cause a structural change. However, for the structural change to be reversible, the base-paired
strands have to dissociate again and form base pairs with another strand, which involves an
unfavorable state in terms of free energy and will take an unimaginably long time. Thus, we
used to think it was impossible to operate such a machine on a practical time scale. However,
the “DNA tweezers” built by Yurke et al. [1] accomplished this.
Reprinted with permission from C. M. Niemeyer, M. Adler, Angew. Chem. Int. Ed. 2002, 41, 3779.
DNA tweezers are a simple structure made of three DNA strands, driven by the sequential
addition of two further DNA strands, each of which serves both as a signal that directs the
structural change and as a fuel. The tweezers-like structure, two rigid rods connected by a hinge, is assembled by the
hybridization of one DNA strand (Ta) and two other DNA strands (Tb and Tc) each of which carries
a sequence complementary to the 5’ or 3’ end of Ta (top left in the figure). The two rods each
have a single strand overhang. When a DNA strand (Fc) that is complementary to both overhangs
is added, it hybridizes to the rods to turn the tweezers into a closed structure. The added DNA
can be regarded as a “signal” to direct the structural change in a sequence-specific manner and
a “fuel” to drive the structural change toward a favorable state in terms of free energy.
If the bound Fc strand is capable of being dissociated from the structure by means of another
signal/fuel DNA strand, it means that a reversible mechanical task can be achieved. In the
structure shown in the bottom of the figure, a single strand overhang in the closed structure,
called a “toe-hold”, is bound by a DNA strand (Fo) that includes a sequence complementary to it. Fo
also carries a sequence complementary to a sequence in the double strand formed between
Fc and the tweezers, causing base pair displacement. Base pair displacement is a process with
no substantial change in free energy, which proceeds in a random-walk manner. When all the
base pairs in Fc have been displaced, the double strand structure formed between Fc and Fo is
released as waste, and the DNA tweezers return to the original open structure. Such a “strand
displacement reaction” is now employed as a driving mechanism in many DNA nanomachines
and enzyme-free DNA logic gates.
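The random-walk character of base pair displacement can be sketched numerically. The Python example below is an idealized model, not a description of the actual kinetics: the branch point moves one base pair forward or backward with equal probability, and it is assumed that the invading strand never falls off its toe-hold (so the walk cannot go below zero displaced base pairs). The function names, trial count, and seed are illustrative choices.

```python
import random

def branch_migration_steps(n_base_pairs, rng):
    """Count random-walk steps until all n base pairs are displaced.

    Position k is the number of base pairs already displaced. Each step
    moves the branch point one base pair forward or backward with equal
    probability; at k = 0 the only possible move is forward (simplifying
    assumption: the invading strand stays anchored by its toe-hold).
    """
    position, steps = 0, 0
    while position < n_base_pairs:
        if position == 0 or rng.random() < 0.5:
            position += 1
        else:
            position -= 1
        steps += 1
    return steps

def mean_steps(n_base_pairs, trials=2000, seed=0):
    """Average the step count over many independent walks."""
    rng = random.Random(seed)
    total = sum(branch_migration_steps(n_base_pairs, rng) for _ in range(trials))
    return total / trials

for n in (5, 10, 20):
    # For this idealized walk the expected step count is n squared.
    print(n, mean_steps(n))
```

Under these assumptions the mean number of steps grows as the square of the number of base pairs to displace, which is one reason the displaced region is usually kept short.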
Recent Progress
DNA tweezers need the addition of a DNA solution to perform each open/close motion. When
the reaction solution is a closed system, this step also results in dilution of DNA tweezers; i.e.,
repeated additions of the DNA solution cause a gradual decrease in operating efficiency. The
accumulation of wastes may also induce unexpected interactions. Asanuma et al. have reported
an azobenzene-based optical control of the hybridization that activates the open/close motion
of DNA tweezers by UV light, without any addition of DNA [2].
Yan et al. made a larger, tweezers-like structure from DNA and attached an enzyme and its
cofactor to its tips, to build a nanoreactor in which an enzymatic reaction was controlled by
the open/close motion of the DNA tweezers [3]. When the DNA tweezers are closed and the de-
hydrogenase and NAD+ come into proximity, the enzymatic reaction is promoted by more than
5-fold compared to that in the open state, demonstrating the control of an enzymatic reaction
by DNA tweezers. Expected future applications of this include more complex mechanical tasks
and elaborate reaction control systems based on DNA tweezers.
—Ken Komiya (TITECH)
Walking Molecule
The DNA tweezers described in the previous section operate by repeating an open/close
motion in place. In the “DNA walker”, a nanostructure moves from one place to another.
What does “walking a molecule” mean? Under the earth’s gravity, our body stays in one place on
the ground unless something happens. However, in the world of nanometer-sized molecules, a
molecule in solution will be blown from place to place by Brownian motion, unless something
happens. A molecule thus needs to bind to a “scaffold” to walk. Organisms have networks of
scaffolds for motor proteins to walk around on, which is important for the transport of molecules
and the like. The scaffold for a DNA walker, in turn, is itself made of DNA. A DNA walker
moves on such a DNA scaffold structure by repeating the cycle of binding to one site on the
scaffold, then binding to the next site, and detaching from the previous site.
a strand displacement reaction that releases A1 from the leg of the walker and the stator,
resulting in the walker again standing on one leg. This walking motion has been demonstrated
by modifying the toe and the stator with different fluorescent groups and observing it by
fluorescence spectrometry (see section 81).
Fig. 1. DNA strands A1, A2, and D1 are highlighted in yellow, pink, and grey, respectively.
Reprinted with permission from Reference [1]. Copyright ©2004, American Chemical Society.
Fig. 2. Reprinted with permission from J. Bath, A. Turberfield, Nature Nanotech. 2007, 2, 275.
Dark green: DNA walker; light green: stator DNA; Black arrowhead: cleavage site.
When this single-strand DNA walker hybridizes to a stator, it produces a nicking enzyme recogni-
tion site in a double strand state where the stator DNA can be cleaved by the nicking enzyme. The
short DNA fragment derived from the cleavage has a low melting temperature, spontaneously
dissociating from the walker. Because all the stators have an identical sequence, the newly
exposed single strand region in the walker serves as a toe-hold to induce a strand displacement
reaction, moving the walker to a neighboring stator. Then the cleavage reaction occurs again,
causing the walker to proceed unidirectionally on the straight track of stators. This process has
been confirmed by fluorescence spectrometry, after which the walking motion was observed
by video recording under high-speed AFM for a single molecule [3].
The “DNA spider” by Lund et al. is another example of a DNA walker moving on a DNA origami
[3]. The DNA spider is a streptavidin protein modified with three uniquely designed DNA legs
that have an RNA cleaving activity; it moves autonomously on multiple RNA scaffolds provided
on a DNA origami by degrading them one after another.
Turberfield et al. prepared a track of single strand DNA array on a DNA origami, on which
their enzymatic reaction-based DNA walker moved [4]. Using high-speed AFM (1 frame/sec),
they succeeded in recording the world’s first video of the motion of a DNA walker. They also
introduced a fork in the track, at which the moving direction of the DNA walker was controlled
by an input DNA strand [5].
As another example of detecting a target based on the structural change of a DNA origami device
induced by interaction, Mao et al. reported a protein detection system based on optical tweezers
[10].
Autonomous Reactions
First, we define the meaning of “autonomous” in this field, because it is a somewhat ambiguous
word. In general, in chemical synthesis, a non-autonomous reaction means a stepwise reaction
that requires external intervention at each step, e.g., “after one step, add reagents for the next
step to occur, after that, add reagents again…” In contrast, an autonomous reaction is a stepwise
reaction that, once reagents have been added to start the reaction, proceeds in accordance
with information programmed in a DNA base sequence or the like to achieve the desired result
(calculation result, synthesized molecule, structural change, motion, etc.). Below are exemplary
autonomous reaction systems.
DNA-Based Computation
Various kinds of molecule-based computation have been proposed and attempted for many
years. One is the reversible molecular Turing machine proposed by Bennett in 1982. Other well-
known examples include those by Conrad, Vaintsvaig, and Liberman, etc. Feynman, who won
the Nobel Prize in Physics, also pointed out the possibility of molecular machines in his speech
titled “There’s Plenty of Room at the Bottom” in 1959.
Adleman was the first to demonstrate the ability of molecules to perform an actual computa-
tion. In 1994, Adleman solved a directed Hamiltonian path problem by ingenious use of DNA
reactions, which launched the field of DNA computing. DNA computing in a narrow sense (i.e.
classical DNA computing) refers to the search and optimization techniques that are based on
Adleman’s solution method. However, the landscape of recent DNA computing research reveals
that the implementation of various other computation models has also been attempted using
molecular reactions, especially of DNA. In general, DNA computing is a field that seeks to use
DNA reactions to implement computation models.
The diagram below lists popular computation models that researchers have sought to
implement by DNA computing. In the course of attempts to implement these popular computation
models using DNA reactions, some have proposed general computation models inspired by DNA
reactions.
Adleman’s, and other similar solution methods, are each composed of multiple steps, and thus
involve many computation models. In particular, the step of generating paths in a digraph can
be regarded as a process of generating strings that represent the paths in accordance with
grammatical rules.
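The generate-and-filter structure of Adleman-style methods can be sketched in software. The Python example below is an analogy only: random path generation stands in for the random ligation of edge strands, and each set comprehension stands in for a laboratory selection step. The graph, the function names, the sample count, and the seed are all hypothetical.

```python
import random

def random_path(vertices, edges, max_len, rng):
    """Grow one random path by following directed edges from a random vertex."""
    path = [rng.choice(vertices)]
    while len(path) < max_len:
        successors = [v for (u, v) in edges if u == path[-1]]
        if not successors:
            break
        path.append(rng.choice(successors))
    return tuple(path)

def hamiltonian_paths(vertices, edges, start, end, samples=5000, seed=0):
    """Return the Hamiltonian paths from start to end found by sampling."""
    rng = random.Random(seed)
    n = len(vertices)
    # Step 1: generate many random paths through the digraph.
    pool = {random_path(vertices, edges, n, rng) for _ in range(samples)}
    # Step 2: keep only paths that begin at start and finish at end.
    pool = {p for p in pool if p[0] == start and p[-1] == end}
    # Step 3: keep only paths with exactly n vertices.
    pool = {p for p in pool if len(p) == n}
    # Step 4: keep only paths that visit every vertex.
    pool = {p for p in pool if set(p) == set(vertices)}
    return pool

# Hypothetical digraph with exactly two Hamiltonian paths from 0 to 4.
VERTICES = [0, 1, 2, 3, 4]
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 1), (2, 3), (3, 2), (3, 4)]
print(sorted(hamiltonian_paths(VERTICES, EDGES, start=0, end=4)))
```

Like the molecular version, this approach is probabilistic: a Hamiltonian path is found only if it happens to be generated in step 1, so the number of samples must be large relative to the number of possible paths.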
DNA Reactions
In order to implement computation models as described above, a set of information (data and
computation states) has to be represented by DNA molecules, and converted by DNA reactions.
The DNA molecules represent information primarily by their base sequence and/or secondary
structure, formed through hydrogen bonding. Then the represented information is converted
by a reaction that edits the base sequence and/or a reaction that alters the secondary structure
as follows.
- Association (hybridization): mutually complementary single strands form a double strand.
- Dissociation: the reverse reaction of association; a double strand resolves into two single strands.
- Restriction enzyme cleavage: a restriction enzyme cleaves a DNA double strand at its target
  site (i.e. recognition sequence). Some restriction enzymes cleave only one strand in a double
  strand, producing a nick.
- Ligation (ligase-mediated linkage): forms a covalent bond at a break or nick in a double strand.
- Polymerase extension: extends the 3’ end of one strand in a double strand, using the other strand
  as a template.
- Strand displacement (branch migration): replaces one strand in a double strand with another strand.
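Some of the reactions listed above can be mimicked with simple string operations, which is often how a design is checked before ordering DNA. The Python sketch below is a deliberately crude model: strands are written 5’ to 3’, only perfect complementarity counts as hybridization, and thermodynamics is ignored entirely. All function names and sequences are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Return the Watson-Crick partner of strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def can_hybridize(strand_a, strand_b):
    """Association: true when the two strands are fully complementary."""
    return strand_b == reverse_complement(strand_a)

def ligate(left, right):
    """Ligation: join two strands that sit end to end on a shared template."""
    return left + right

def polymerase_extend(primer, template):
    """Polymerase extension: extend the primer's 3' end along the template.

    Assumes the primer pairs with the 3' end of the template (both written
    5' to 3'); returns the full-length product, or None if it cannot bind.
    """
    product = reverse_complement(template)
    return product if product.startswith(primer) else None

template = "GATTACAC"
print(reverse_complement(template))
print(polymerase_extend("GTGT", template))
```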
Logic Circuit
A logic circuit is one of the most basic computation models. Why? A bit is the most fundamental
data unit. Thus a function that receives a bit sequence as input and generates a bit as output
(a Boolean function) provides the most basic operation. Such a function is implemented by a logic
circuit. This type of logic circuit is called a “combinatorial circuit”. A sequential circuit is another
type of logic circuit, one that has a memory section. This section will focus on combinatorial
circuits.
In mathematics, a logic circuit is a digraph without cycles. In general, the number of directed
edges entering a node in a digraph is called the indegree of the node, while the number leaving
it is called the outdegree. In a logic circuit, a node is classified as an input node or logic gate. The
indegree of an input node is 0. Basic logic gates include AND, OR, and NOT. AND and OR have an
indegree of 2, while NOT has an indegree of 1. One of the logic gates is defined as an output
node. The length of the longest path from an input node to the output node is called the depth
of the logic circuit.
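The definitions above map directly onto a small data structure. The following Python sketch represents a hypothetical circuit as a digraph without cycles, evaluates it by recursing over incoming edges, and computes its depth. The circuit itself and all names are invented for illustration.

```python
# Hypothetical circuit: out = (a AND b) OR (NOT c).
# Each gate maps to (operation, list of predecessor nodes).
GATES = {
    "g1": ("AND", ["a", "b"]),
    "g2": ("NOT", ["c"]),
    "out": ("OR", ["g1", "g2"]),
}

OPS = {
    "AND": lambda x, y: x and y,   # indegree 2
    "OR": lambda x, y: x or y,     # indegree 2
    "NOT": lambda x: not x,        # indegree 1
}

def evaluate(node, inputs):
    """Evaluate a node by recursing over its incoming edges."""
    if node in inputs:             # input node: indegree 0
        return inputs[node]
    op, preds = GATES[node]
    return OPS[op](*(evaluate(p, inputs) for p in preds))

def depth(node):
    """Length of the longest path from any input node to this node."""
    if node not in GATES:
        return 0
    return 1 + max(depth(p) for p in GATES[node][1])

print(evaluate("out", {"a": True, "b": False, "c": True}), depth("out"))
```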
From the viewpoint of computational complexity, the logic circuit as a computation model has
been extensively studied. For example, the “NC” class of computational complexity has been
defined. When a Boolean function family with an input of n bits belongs to NC, it means that
there is a logic circuit family to implement the function, with the number of gates being a
polynomial of n and the depth of the circuit being a polynomial of log n. Note that a Turing
machine with limited computational resources exists, and receives n as input and generates
(the representation of) a logic circuit for an input of n bits as output.
Implementation by Molecules
Implementing logic circuits by molecular reactions is the most basic subject in the field of
molecular computing. In order to design a molecular reaction-based logic circuit, you must
define how to represent the bits 0 and 1, first of all. Generally, a bit can be represented by
voltage, light intensity/wavelength, magnetic field direction, temperature, and many other
physical phenomena. You have to assign a specific molecule species to each gate in order to
make a plurality of molecular reaction-based gates operate in a chain reaction, with an output
of one gate being an input to another gate.
Accordingly, in the most basic case, different molecular species are assigned to different gates.
For example, molecular species G is assigned to gate g. When the output of gate g is 0, it is
represented by the absence of molecular species G (concentration = 0), and when the output
of gate g is 1, it is represented by the presence of molecular species G (concentration ≥ a
predetermined threshold). Theoretically, an output of 1 can be represented by as little as one
molecule. Practically, a threshold concentration is defined so that an output is determined to
be 1 when the concentration of G is equal to or more than the threshold. It’s best to have the
concentration as close to 0 as possible when the output is 0. This representation method is
called single-rail representation in comparison with dual-rail representation, which is as follows.
In so-called dual-rail representation, two molecular species G and H are assigned to each
gate g. An output of g of 1 is represented by the presence of molecule G (concentration ≥
a predetermined threshold), whereas an output of g of 0 is represented by the presence of
molecule H (concentration ≥ a predetermined threshold).
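The two representations can be compared by writing down how an output would be read from measured concentrations. In the Python sketch below, the threshold value, the concentration units, and the function names are hypothetical.

```python
THRESHOLD = 0.5  # hypothetical threshold, in arbitrary concentration units

def read_single_rail(conc_g, threshold=THRESHOLD):
    """Single-rail: species G present means 1, absent means 0."""
    return 1 if conc_g >= threshold else 0

def read_dual_rail(conc_g, conc_h, threshold=THRESHOLD):
    """Dual-rail: G present means 1, H present means 0.

    Returns None when neither or both species reach the threshold
    (no valid output yet, or a conflict between the two rails).
    """
    g_on = conc_g >= threshold
    h_on = conc_h >= threshold
    if g_on == h_on:
        return None
    return 1 if g_on else 0

print(read_single_rail(0.0), read_dual_rail(0.0, 0.0), read_dual_rail(0.0, 0.9))
```

One practical difference shows up in the sketch: in single-rail representation, "no molecule yet" and "output 0" look the same, whereas in dual-rail representation an output of 0 is signaled by a molecule that is actually present.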
Enzyme-Free Logic Circuits of Seelig
Below is a schematic of an enzyme-free logic gate by Seelig et al.
In this figure, an AND gate is implemented. Gin and Fin are inputs. First, Gin binds to the toe-
hold in the single strand G in the gate, which results in a double strand formation between Gin
and G by strand displacement reaction. Then the newly exposed toe-hold is bound by Fin, which
results in a double strand formation between F and Fin by strand displacement reaction. As a
result, when the inputs Gin and Fin are both present, the output Eout is released as a single
strand.
The single stranded (or partially single-stranded) DNA molecules each represent a bit, and the
gate operates by generating new single strands depending on the combination of the inputs. The
strand displacement reaction to form a stable double strand is irreversible, i.e. proceeds only
unidirectionally. Therefore, this system cannot drive a reaction by the absence of a molecular
species, making it hard to implement a NOT gate in single-rail representation. When a circuit
contains a NOT gate, dual-rail representation is required.
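To see why dual-rail representation removes the need for a reaction driven by absence, consider the following Python sketch. It shows one common dual-rail construction, which is not necessarily the exact scheme used by Seelig et al.: every signal is a pair (rail for 1, rail for 0), AND and OR are built only from operations that respond to the presence of a rail, and NOT simply swaps the two rails. The function names are illustrative.

```python
from itertools import product

def encode(bit):
    """Dual-rail encode a bit as (rail_1_present, rail_0_present)."""
    return (bit == 1, bit == 0)

def dual_and(a, b):
    # Output is 1 when both inputs are 1; output is 0 when either input is 0.
    return (a[0] and b[0], a[1] or b[1])

def dual_or(a, b):
    # Output is 1 when either input is 1; output is 0 when both inputs are 0.
    return (a[0] or b[0], a[1] and b[1])

def dual_not(a):
    # No reaction is needed: NOT just relabels which rail means 1.
    return (a[1], a[0])

for x, y in product((0, 1), repeat=2):
    # NAND built only from presence-driven operations.
    assert dual_not(dual_and(encode(x), encode(y))) == encode(1 - (x & y))
```

The cost of this construction is that each logical gate needs two molecular gates, one per rail.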
Reversible Logic Circuit of Genot et al.
Genot et al. have implemented a reversible logic gate as shown in the following figure, which is
also an AND gate.
Each of the two inputs (Fin and Gin) can alter the structure of the AND gate, but the single strand
region Eout as output is exposed only when both inputs are present. These reactions, though
with different reaction rates, each can proceed in either direction. Thus, when one of the inputs
is removed, the reaction proceeds in the reverse direction and the output is lost.
Therefore, a NOT gate is also possible in single-rail representation. A single strand G, which
represents the NOT gate, is put in the system in advance. An input to the NOT gate is a single
strand complementary to G. The input can form a double strand with G to remove it from the
system, meaning that G has been negated. Once G is removed, the gate reaction in the later step
proceeds reversibly, so that the logic circuit operates correctly.
—Masami Hagiya (Tokyo University)
Schematic of DNAzyme
DNAzyme-Based Systems
A DNAzyme is not only applicable to a metal ion detection system, but also to more complex
systems that are driven by DNA inputs. Typical examples include walkers and logic gates. In a
walker system, a DNAzyme repeats the degradation reaction, causing the site of degradation to
move around [4] (see section 25).
A DNA logic gate has been implemented so that a substrate is cleaved only when a desired
hybridization has occurred [1]. Fig. 2 shows an AND gate. In the initial state, the hairpin struc-
tures in the top and bottom portions of the red DNA prevent its hybridization to the substrate.
Either input can open one of the hairpin structures, which is still not sufficient for the substrate
to be cleaved. Only in the presence of both inputs can the substrate be cleaved to emit the
fluorescence as output.
Conversely, a system that is initially active but is inactivated by an input can be constructed
by placing a similar hairpin in the active site. In addition to AND, (non-implication) and XNOR
have also been reported [1].
A cascade of such DNAzyme logic gates has also been attempted [5]. The system is schematically
shown in Fig. 3. Each cleaved product activates or inactivates the next cleavage reaction,
allowing a particular combination of inputs to generate a fluorescence as output.
Fig. 3. Multilayered DNAzyme logic gate cascade [5]. Reprinted with permission from Nature
Publishing Group, 2010.
DNAzyme-based tic-tac-toe
Reprinted with permission from American Chemical Society, 2006 and Angewandte Chemie
International Edition, 2014.
—Ibuki Kawamata (Tohoku University)
the state represented by Intermediate 1 (bottom). The domain C is now in a single strand
state, to which the fuel molecule hybridizes, inducing branch migration to form the structure
Intermediate 2 (middle right), generating the output molecule. Then the denaturation reaction
releases the catalyst and a waste product (shown in purple and blue), and the latter cannot
participate in the reaction any more (top right).
The activity of the catalyst is not affected by the reaction. Once released after the reaction, it can
drive another cycle and function endlessly as long as the substrate is present.
2 hours. This means that an output of more than 50% was obtained from an input of 10%; i.e.
one input drove 5 cycles or more on average.
Some researchers have reported a two-level gate system such as an autocatalytic reaction
with the output and input having an identical sequence. Experimental and model analysis
results for the two-level system suggest that 12 hours of the reaction can even determine the
presence/absence of the input molecule at a concentration of 0.01% (in chemical equivalent).
However, the autocatalytic reaction generates some amount of the output also in the absence
of the input (leak reaction) as another consequence of the exponential amplification.
Entropy
What kind of energy is driving the catalytic amplification reaction system in Fig. 1? Let’s examine
the catalytic reaction step as shown in Fig. 3. Before the reaction, two molecules (i.e. substrate
and fuel) are present. The addition of the catalyst drives the reaction, which results in three
molecules (i.e. waste, signal, and output).
When looking at the hybridized domains, all the domains B, C, and D are bound before and after
the reaction. In other words, the number of hydrogen bonds stays the same before and after the
reaction. Thus, the number of hydrogen bonds, which affects the enthalpy term (see section 37)
of free energy (see section 38), does not matter much in this reaction.
What does change in the reaction is the number of molecules (from 2 molecules
to 3 molecules, as described above); i.e., the molecules gain degrees of freedom,
increasing the configurational entropy (see section 37). An experiment with a shorter fuel is
described in the paper, to show that the reaction really is entropy-driven.
Because it is an entropy-driven system, the reaction is relatively resistant to the influence
of temperature and salt concentration on DNA hybridization. Thus, various applications are
conceivable besides amplification.
Reprinted from the Journal of the Royal Society Interface, 2011 [4] (open access).
2) Output
The following figure shows a chain of strand displacement reactions and is the origin of
the name seesaw gate. In the figure, reactions indicated by solid lines are predominant, whereas
those indicated by dotted lines are reverse reactions. First, when the input DNA meets a double
strand DNA included in the gate (gate:output), it undergoes a strand displacement reaction to
generate a single strand output DNA and a double strand DNA (gate:input). Gate:input exposes
a toehold sequence T’, to which a fuel DNA binds to undergo another strand displacement
reaction that converts gate:input into the input and a double strand DNA (gate:fuel). The
first strand displacement reaction is a material flow from the top to the bottom of the figure,
and the second reaction is from the bottom to the top, constituting a “seesaw”-like reaction
cycle. Most importantly, the input is consumed in the first strand displacement reaction but
supplied again in the second strand displacement reaction. Generally, in a circuit design, you
must take care that individual circuits work correctly not only alone but also in combination
when connected to each other. Such independence of the performance of individual circuits from
their connections is called modularity [2].
Reprinted from the Journal of the Royal Society Interface, 2011 [4] (open access).
Seesaw gates are highly modular, because they can generate output DNA without consuming
too much input DNA in the reaction cycle. For comparison, look at a larger-scale circuit
achieved by combining the logic gates constructed in [1].
produced proteins, in turn, can activate other genes or alternatively suppress them. Multiple
genes influence each other in a complex manner, creating a Gene Regulatory Network (GRN).
Complex behaviors of GRN such as oscillation [1] and memory [2] can be reproduced by
engineered systems. However, the process of DNA -> RNA -> protein includes multiple steps so
that the reaction may take a long time and may be difficult to control. Montagne et al. proposed
the following strategy to build a similarly complex system by combining DNA and enzymes [3].
A “template” DNA about 25 bases long serves as a gene and directs the synthesis of a short
signal DNA by an enzymatic reaction; the short signal DNA hybridizes to its target, i.e. the
template, to activate or suppress its function; and the signal DNA is constantly degraded by
an exonuclease, so that the system is kept dynamic.
The third module for inhibition is driven by a type of signal DNA called an “inhibitor”. The
inhibitor can bind to more than one site in the target template, but does not elicit polymerase
or nickase reactions (due to sequence mismatch and incomplete recognition sequence,
respectively). Until the inhibitor dissociates, the template is kept inactive. Note that the inhibitor
cannot be used as an input DNA.
Based on this platform, Montagne et al. developed the oscillator shown in Fig. 3 [3], and Padirac
et al. developed a two-state circuit and two-state switch [4]. The performance of their systems
has been experimentally verified. Theoretically, more complex systems have been proposed [5].
Fig. 3: Oscillator and experimental results by Montagne et al. Reprinted from Molecular Systems
Biology, 2011 [3] (open access).
Simple Model
Because a DNA toolbox includes non-linear reactions, it is hard to predict the behavior of
the system completely. As a first approximation, the template is regarded as a black box as
shown in Fig. 4. The black box is a simple converter based on Michaelis-Menten reaction
kinetics. This model assumes that hybridization and denaturation (association and dissociation
between complementary DNA molecules, respectively) reach equilibrium much faster than
the enzymatic reaction does. As a result, the equation in Fig. 4b is obtained for each module.
Based on the graph representation, an equation that represents the entire system can be
derived easily by adding together the reactions of the templates that correspond to different
signal DNA molecules. The degradation by the exonuclease is incorporated as an approximation
by the first-order reaction shown in Fig. 4d.
Fig. 4: Representation and Equation of a DNA Toolbox as a Black Box. A vertical line indicates
restriction, [ ] indicates concentration, and the Greek-letter symbols in the equations are
parameters; exo means exonuclease reaction.
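The black-box idea can be illustrated with a small numerical sketch. The equations of Fig. 4 are not reproduced here, so the rate law below is only an assumed stand-in: a generic Michaelis-Menten-type production term driven by a constant input, minus a first-order degradation term for the exonuclease. The function name and the parameters v_max, k_m, and k_exo are hypothetical and are not taken from the original paper.

```python
def simulate_black_box(x_in, v_max, k_m, k_exo, dt=0.01, steps=20000):
    """Euler integration of d[out]/dt = v_max*x/(k_m + x) - k_exo*[out].

    x_in  : constant input signal concentration (assumed, arbitrary units)
    v_max : maximum production rate of the template (assumed)
    k_m   : Michaelis-type constant of the template (assumed)
    k_exo : first-order degradation rate constant (assumed)
    """
    out = 0.0
    for _ in range(steps):
        production = v_max * x_in / (k_m + x_in)  # saturating production term
        degradation = k_exo * out                  # first-order exonuclease term
        out += dt * (production - degradation)
    return out


# With a constant input, the output settles where production equals degradation.
steady = simulate_black_box(x_in=10.0, v_max=1.0, k_m=10.0, k_exo=0.1)
print(steady)
```

Because every module has the same simple form, a larger network can be simulated by summing such terms for each signal, which is why this kind of model is cheap to evaluate many times.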
Such modeling of DNA toolboxes was proposed by Padirac et al. for the first time [4]. The
equations are simple enough to allow a large system to be simulated in a very short time. This
advantage has been helpful in the evolution of a complex DNA toolbox system, which requires
several thousand cycles of evaluation. In the course of its evolution, a system with an interesting
pattern was algorithmically found [7].
reactions could solve mathematical problems (see section 36). Here, a DNA molecule is regarded
as a “recording tape” that contains a string of symbols, and each computation step is driven
manually. This type of DNA computing is called “classical” or “first generation”. On the other
hand, “second generation” DNA computing, or “molecular programming”, executes staged
computation by a series of reactions that proceed autonomously in a solution, for which several
computation models have been proposed.
When a DNA nanostructure is altered stepwise in a series of reactions, these structures can be
regarded as different “states” in computation. If a given structure induces a particular reaction
to change itself into a different structure, it can be regarded as the nanostructure referring to its
internal state to execute a computation. Recently, much work has been done on DNA logic gates
that are based on the strand displacement reaction of DNA (see section 29). This section will focus
on two enzymatic reaction systems that implement a “finite state machine” with internal states:
i.e., a finite automaton based on restriction enzymes, proposed early in the DNA computing field,
and a finite state machine that executes computation by repeating an extension reaction of DNA
polymerase.
DNA Automaton
A “finite automaton (FA)” is an abstract computation model that has internal states and a state
transition function. When an FA receives a symbol string as input, it reads the symbols
sequentially and executes state transitions according to its current state. When the final
state is an accept state, the symbol string is “accepted”. For example, in the following figure,
upon input of the symbol string “baba”, the FA reads the leftmost b, and a transition from the
initial state S₀ to S₁ occurs according to the state transition function. Then the second
symbol, a, is read, and a transition from the state S₁ to S₁ occurs. When the third symbol, b,
is read, a transition from the state S₁ to S₀ occurs, and when the fourth symbol, a, is read, a
transition from the state S₀ to S₀ occurs. In this example, the state S₀ is an accept state,
and the FA accepts the input baba.
An FA that accepts a symbol string containing an even number of bs. Each circled letter indicates
a state, and the double circle indicates an accept state. Each small letter above an arrow
indicates the input symbol that triggers that state transition.
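The automaton in this example is small enough to write down as a lookup table. The following Python sketch (the names TRANSITIONS and run_fa are chosen here for illustration) encodes the four transitions described above and replays the input “baba”.

```python
# Transition table for the two-state automaton described in the text:
# reading "a" keeps the state, reading "b" switches between S0 and S1.
TRANSITIONS = {
    ("S0", "a"): "S0",
    ("S0", "b"): "S1",
    ("S1", "a"): "S1",
    ("S1", "b"): "S0",
}
ACCEPT_STATES = {"S0"}


def run_fa(symbols, start="S0"):
    """Return (accepted, list of visited states) for the given symbol string."""
    state = start
    trace = [state]
    for symbol in symbols:
        state = TRANSITIONS[(state, symbol)]
        trace.append(state)
    return state in ACCEPT_STATES, trace


accepted, trace = run_fa("baba")
print(accepted, trace)
```

The trace for “baba” visits S0, S1, S1, S0, S0, matching the walk-through above, and any string with an odd number of bs ends in S1 and is rejected.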
Shapiro et al. implemented an FA in a reaction system composed of DNA and Fok I, a type
IIs restriction enzyme [1]. Type II restriction enzymes recognize a specific sequence in a double
strand DNA and cut both DNA strands at a specific position (see section 47). Among type II
restriction enzymes, type IIs enzymes cut DNA outside their recognition sequences. In the figure
below, the “input DNA” encodes an input symbol string and has a terminal overhang sequence,
to which a complementary terminal overhang sequence in the “state transition function DNA”
can hybridize to cause the cleavage of the input DNA by Fok I. This corresponds to reading one
symbol to execute a state transition. The cleavage produces a new terminal overhang in the
input DNA, the next symbol is read out as before, and another state transition occurs. A spacer
sequence of 1 to 3 bases immediately to the right of the Fok I recognition sequence
elegantly encodes multiple states. For details, please refer to the original paper [1] and a review
in Japanese [2].
State transition reaction in a DNA automaton. N can be any base and each vertical line
indicates a base pair. A Fok I recognition sequence is boxed. Reprinted from Fig. 5.9 in [2].
Whiplash PCR
Hagiya et al. employed an extension reaction by a DNA polymerase that occurs when the 3’
terminus of a DNA hybridizes to a complementary sequence in the same molecule and serves as
a primer, to devise a finite state machine that encodes stepwise state transition rules in a
single DNA molecule [3]. Usually, a DNA extension reaction happens in the presence of dNTPs
of all four bases in a reaction solution. If one of the bases has been depleted, the extension
reaction halts at the template base complementary to it. This “polymerization stop” allows the
extension reaction to add exactly the stretch of bases that encodes one state at a time.
Polymerization Stop. The reaction solution is depleted of dTTP. The extension reaction halts at
a base A in the template DNA.
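The polymerization stop can be mimicked with a few lines of Python. This is only a string-level sketch under simplifying assumptions (perfect base pairing, no misincorporation); the function name extend and the example template are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend(template_read_order, available_dntps):
    """Return the bases added to the primer's 3' end.

    template_read_order : template bases in the order the polymerase reads them
    available_dntps     : set of bases whose dNTP is present in the solution
    """
    added = []
    for base in template_read_order:
        needed = COMPLEMENT[base]
        if needed not in available_dntps:
            break  # polymerization stop: the required dNTP is missing
        added.append(needed)
    return "".join(added)


# Without dTTP, extension halts at the first A in the template.
print(extend("GCCGAGG", {"A", "G", "C"}))
```

In this toy template the stretch before the first A plays the role of one encoded state: only that stretch is copied in a single extension step.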
The 3’ terminal sequence represents the “current state” of the finite state machine. The addition
of a new sequence to the 3’ terminus by the extension reaction corresponds to a state
transition. When the added sequence hybridizes to another complementary sequence elsewhere
in the same molecule, it causes another state transition, so that stepwise state transitions are
executed autonomously. This looks like whiplashing, hence the name “whiplash PCR”. In this
reaction system, the hybridization between complementary DNA sequences is an intramolecular
reaction, the rate of which does not depend on DNA concentration, unlike in an intermolecular
hybridization reaction. Also, many finite state machines with different sequences can
execute a staged computation in parallel in a single reaction vessel. Komiya et al. have shown
that a small-scale benchmark problem like Adleman’s can be solved using whiplash PCR
with a limited number of cycles of the experimental procedure [4].
genetic algorithms, as well as in-vitro selection, under the title of “From Search to Optimization”,
are also discussed briefly.
For a digraph (a graph with directed edges) as shown above, the directed Hamiltonian path
problem asks to find a path that travels from a given start node to a given goal node, visiting
every node exactly once (“Hamiltonian path”).
In general, a problem that asks for a solution meeting given requirements among a huge number
of candidate solutions is often classified as an NP-complete problem. In short, in an
NP-complete problem, as the problem size increases (the number of nodes and edges in the
case of a graph), the number of candidate solutions increases exponentially, so that the solution
search becomes extremely difficult. The directed Hamiltonian path problem is indeed
NP-complete.
In classical DNA computing, in order to solve a search problem, individual candidate solutions
are represented by DNA molecules of different sequences, and all candidate solutions are
generated as a library of DNA molecules. Specifically, a large DNA library of different sequences
is prepared in a test tube. The entire library is then subjected to a series of selection steps to
identify a DNA molecule that meets the requirements of the solution, thus solving the search
problem.
By chemically synthesizing node DNA molecules and directed edge DNA molecules and
annealing them in a test tube, a variety of double strand DNA molecules representing paths in a
graph can be obtained, as shown in this figure. This process corresponds to random generation
of paths in the graph. Here, neighboring DNA molecules can be covalently joined by means of
ligases.
Assume that you have obtained a sufficient number of DNA molecules, so that all paths in a
graph, especially those as long as (or shorter than) the Hamiltonian path, have been generated.
In this case, you will only have to choose a path that meets the requirements of the solution
(Hamiltonian path) from the paths on the graph that have been generated as DNA molecules.
In order to solve a Hamiltonian path problem, Adleman performed a selection based on PCR,
electrophoresis, and magnetic beads. He used PCR primers that represented the start and the
goal, respectively, to select/amplify only paths that led from the start to the goal. From these,
he selected paths of the same length as that of the Hamiltonian path by electrophoresis. Finally,
using magnetic beads, paths that contained all nodes except the start and goal were selected.
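The generate-then-select strategy can be imitated in software. The sketch below is only an analogy: random walks on a small, made-up graph stand in for the annealed DNA library, and three list filters stand in for the PCR, electrophoresis, and magnetic-bead steps. The graph, the stopping probability, and the function names are all illustrative assumptions.

```python
import random


def random_path(edges, start_nodes, max_len, rng):
    """Grow one random walk along directed edges (stands in for random annealing)."""
    path = [rng.choice(start_nodes)]
    while len(path) < max_len:
        successors = [v for (u, v) in edges if u == path[-1]]
        if not successors or rng.random() < 0.2:  # the walk may also stop early
            break
        path.append(rng.choice(successors))
    return tuple(path)


def select_hamiltonian(paths, nodes, start, goal):
    # Step 1 (PCR analogue): keep paths that begin at start and end at goal.
    kept = [p for p in paths if p[0] == start and p[-1] == goal]
    # Step 2 (electrophoresis analogue): keep paths with exactly len(nodes) nodes.
    kept = [p for p in kept if len(p) == len(nodes)]
    # Step 3 (magnetic-bead analogue): keep paths that contain every node.
    kept = [p for p in kept if set(p) == set(nodes)]
    return set(kept)


nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (2, 1), (1, 3)]
rng = random.Random(0)
library = [random_path(edges, nodes, len(nodes) + 2, rng) for _ in range(5000)]
solutions = select_hamiltonian(library, nodes, start=0, goal=3)
print(solutions)
```

Each filter is applied to the whole library at once, which is the software counterpart of treating every molecule in the tube with the same operation.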
This operation is equally applied to all DNA molecules present in the test tube. Because each
DNA molecule represents a path in a graph, such an operation is called “data parallelism”,
meaning that it is applied equally to all data. Thus, Adleman’s method can be regarded as
DNA Computing
In computer science, the field of research to develop novel computation models inspired by
natural phenomena is called “natural computing”. “DNA computing” started when the computer
scientist Adleman, a co-inventor of the RSA cipher, solved a directed Hamiltonian path problem using
DNA reactions (see section 35)[3]. After having succeeded in assembling DNA nanostructures
(structural DNA nanotechnology) and moving them (dynamical DNA nanotechnology), scientists
are aiming at constructing intelligent molecular systems (molecular robotics). In BIOMOD,
you don’t have to limit your project to DNA nanostructures. Try to exploit the potential of
biomolecules as much as possible to design an elaborate system that may not exist, or is not
yet known, in natural biological systems.
Transdisciplinary Team
As described above, successful BIOMOD projects require a transdisciplinary team. Because DNA
and protein are biological macromolecules, you need knowledge in chemistry and
macromolecular physics to experiment with and analyze them in depth. If you want to design a molecular
robot, you will need some knowledge in mechanical engineering and programming. If you are
familiar with electronics, you might be able to develop a novel measurement technique. In
addition, a broad range of knowledge in biology, medicine, and computer science will help you
identify an exciting project. However, it is impossible to learn all these things in such a short
period, which is why you need a diverse team. Team members must communicate to support
each other and work in an organized manner, putting the right people in the right places. Last
but not least, most Japanese students seem to have some trouble with their English skills. Do
not forget to give yourself language education, so you can work worldwide as a researcher or
engineer.
Chapter 4 Basic Textbook Knowledge 123
37. Thermodynamics
This section introduces internal energy, enthalpy, entropy, and other concepts that are required
to understand the thermodynamic properties of DNA.
The energy of a system means the ability of the system to perform work. Work is the energy
required to move something against a resisting force. For example, when a gas expands, it
does work on its surroundings. In thermodynamics, the energy contained in the system
is called the internal energy. It has been experimentally demonstrated that work and/or heat
from the surroundings cause a change in internal energy.
internal energy of an isolated system is constant. This is the first law of thermodynamics (the
law of conservation of energy), and is expressed as follows.
$$\Delta U = Q + w \tag{1}$$
(w: work exerted on the system, Q: heat transfer to the system, ∆U: change in the internal energy)
Enthalpy
When the volume of a system undergoes an infinitesimal volume change dV under a pressure
p, the infinitesimal work d’w done by the surroundings can be expressed as follows
(pressure-volume work).
$$d'w = -p\,dV \tag{2}$$
When the pressure-volume work is the only work done on the system, the infinitesimal change
in internal energy dU can be written as follows, from the equations (1) and (2).
$$dU = d'Q - p\,dV \tag{3}$$
Thus, when only pressure-volume work is exerted on the system under a constant pressure, from
the equation (3),
$$d'Q = dU + p\,dV = d(U + pV) = dH \qquad (p\ \text{constant}),$$
where H = U + pV is the enthalpy. That is, the change in the enthalpy is equal to the heat
transferred. In other words, enthalpy is the energy contained in a system under a constant
pressure.
Inequality of Clausius
Imagine a heat engine that receives heat Qᵢ from each of n heat sources to perform work w in
a cyclic process, then returns to the original state. With the temperature of each heat source
being Tᵢ,
$$\sum_{i=1}^{n} \frac{Q_i}{T_i} \le 0 \tag{7}$$
holds*. This is a mathematical representation of the second law of thermodynamics, called the
inequality of Clausius. When n → ∞,
$$\oint \frac{d'Q}{T} \le 0 \tag{8}$$
holds for any heat engine that continuously changes its state. In the inequalities (7) and (8), the
left and right hand sides are equal when the cycle is a reversible process.
Entropy
Entropy S is defined as follows.
$$\int_{A}^{B} \frac{d'Q_{\mathrm{rev}}}{T} = S_B - S_A \tag{9}$$
The left-hand side represents a reversible change from state A to state B. In a cycle in which
the system is changed from state A to B through an irreversible process, and then returned to A
through a reversible process,
$$\int_{A}^{B} \frac{d'Q}{T} + \int_{B}^{A} \frac{d'Q_{\mathrm{rev}}}{T} = \int_{A}^{B} \frac{d'Q}{T} - (S_B - S_A) \le 0$$
based on the equation (9). Thus, for any change of state in general,
$$S_B - S_A \ge \int_{A}^{B} \frac{d'Q}{T}$$
holds. For an infinitesimal change, it can be written as: dS ≥ d′Q/T. In an adiabatic process,
$$dS \ge 0$$
because d′Q = 0. This is called the law of entropy increase, meaning that any spontaneous
change in an isolated system does not cause a decrease in the entropy.
W is the number of possible microscopic states of the system in a given macroscopic state. For
example, for a given volume, the position of molecules in the system cannot be determined
unequivocally. When a space of a volume V is divided into infinitesimal spaces of a volume δV
and n molecules are placed therein, the number of possible states is: W = (V⁄δV)ⁿ. Therefore,
most simply, an increase in the number of molecules and/or the volume results in an increase
in the entropy.
The higher the entropy is, the larger W is, i.e., the more disordered the system is. Thus, the law
of entropy increase means that, in a spontaneous process, the system moves toward a more
random, disordered state.
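The dependence on the number of molecules and the volume can be checked numerically. The sketch below assumes the Boltzmann relation S = k_B ln W together with the counting W = (V⁄δV)ⁿ given above; the cell volume and the function name are illustrative choices, and the logarithm is expanded as n ln(V⁄δV) to avoid computing the huge number W itself.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def config_entropy(n_molecules, volume, cell_volume):
    """S = k_B ln W with W = (V / dV)^n, evaluated as n * k_B * ln(V / dV)."""
    return K_B * n_molecules * math.log(volume / cell_volume)


s_two = config_entropy(2, 1.0, 1e-6)    # two molecules
s_three = config_entropy(3, 1.0, 1e-6)  # three molecules, same volume
s_bigger = config_entropy(2, 2.0, 1e-6) # two molecules, doubled volume
print(s_three > s_two, s_bigger > s_two)
```

In this simple counting, going from two molecules to three in the same volume raises the entropy by a factor of 1.5, which is the kind of gain that drives the entropy-driven reactions discussed in Chapter 3.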
*For detailed derivation, see section 38 [2] and [3].
—Fumi Takabatake (Tohoku University)
from the above-mentioned inequality. In other words, in a system at constant pressure and
temperature, a spontaneous change proceeds toward a minimum of the Gibbs energy. This
principle is used to measure the favorability of a reaction in a typical chemical experiment,
which is usually focused on a chemical reaction under a constant pressure.
wherein each depends on infinitesimal changes in two other state variables. Then, when G
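is regarded as a function of temperature and pressure and totally differentiated, it can be
expressed as follows.
$$dG = \left(\frac{\partial G}{\partial T}\right)_p dT + \left(\frac{\partial G}{\partial p}\right)_T dp \tag{6}$$
The subscripts in the derivative terms indicate that p and T, respectively, are held constant in
the partial differentiation of G by T and by p. By comparing the equations (5) and (6),
$$\left(\frac{\partial G}{\partial T}\right)_p = -S, \qquad \left(\frac{\partial G}{\partial p}\right)_T = V$$
Because S and V always have positive values, the Gibbs energy is decreased by an increase in
temperature and increased by an increase in pressure.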
When the pressure is changed from p₁ to p₂ at a constant temperature (dT = 0), the change in G
is as follows based on the equation (5).
$$G(p_2) - G(p_1) = \int_{p_1}^{p_2} V\,dp$$
In the case of an ideal gas, for which pV = nRT (the ideal gas law),
$$G(p_2) - G(p_1) = nRT \ln\frac{p_2}{p_1}$$
Based on the Gibbs energy G⁰ of the pure substance in its standard state under the
standard pressure (p⁰ = 1 bar), the Gibbs energy under pressure p can be expressed as follows.
$$G(p) = G^0 + nRT \ln\frac{p}{p^0}$$
Chemical Equilibrium
In a reversible reaction, when the rate of the forward reaction and the rate of the reverse reaction
balance each other and the concentration of the substances does not change macroscopically,
this state is called chemical equilibrium.
Look at the reversible reaction below.
Therefore,
is a state variable called “chemical potential”, and this can be regarded as the Gibbs energy
per amount of substance of the component i. From the equation (12), the equation (13) can be
rewritten as follows.
Here, reaction Gibbs energy ΔrG is introduced as the change in the Gibbs energy due to the
advancement of the reaction. Based on the equation (15),
Thus, ΔrG can be expressed as the difference in chemical potential between the reactants and
the products in the reaction mixture composition. Because a spontaneous reaction proceeds
toward a decrease in Gibbs energy, the forward reaction will dominate when ΔrG < 0, whereas
the reverse reaction will dominate when ΔrG > 0. When ΔrG = 0, the reaction no longer proceeds
in either direction, resulting in an equilibrium state.
pᵢ is the partial pressure of substance i. Therefore, using the standard Gibbs energy change,
based on the partial pressure pᵢ,eq of each substance in the equilibrium state. When the ratio of
the partial pressure of the reactants and products in the equilibrium state is represented by Kp,
The activity is usually a complex function of temperature, pressure, and amount of substance.
The equation is similar to that of a molar fraction or any other concentration. In particular, in
a dilute solution, it is equal to the molar fraction xᵢ = nᵢ/n [1]. An equilibrium constant based on
activity
is called the “thermodynamic equilibrium constant”, based on which ΔrG⁰ = −RT lnK in a typical
solution.
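The relation ΔrG⁰ = −RT lnK can be turned around to estimate an equilibrium constant from a standard reaction Gibbs energy. The following sketch uses SI units (J mol⁻¹ and K); the function names and the example value of −10 kJ mol⁻¹ are illustrative assumptions, not data from this book.

```python
import math

R = 8.314462618  # gas constant in J mol^-1 K^-1


def equilibrium_constant(delta_g0, temperature):
    """K = exp(-ΔrG⁰ / RT), obtained by solving ΔrG⁰ = -RT ln K for K."""
    return math.exp(-delta_g0 / (R * temperature))


def standard_gibbs_energy(k_eq, temperature):
    """ΔrG⁰ = -RT ln K."""
    return -R * temperature * math.log(k_eq)


k_example = equilibrium_constant(-10_000.0, 298.15)  # illustrative ΔrG⁰
print(k_example)
```

A negative ΔrG⁰ gives K greater than 1 (products favored), ΔrG⁰ = 0 gives K = 1, and a positive ΔrG⁰ gives K less than 1.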
In a dilute solution, the molar fraction of the solute is approximately proportional to its
molar concentration. Therefore, by approximation, the activity is often replaced with the
concentration and the following equation based on the concentration equilibrium constant Kc is
used [2].
The advancement ξ, which describes the extent of the advancement of the reaction from the
initial state, is used to express the molar amount of each component at time t as follows.
Thus, the infinitesimal change in molar amount can be written as follows, relative to time.
When the volume V of the system is constant, the temporal change in the concentration of each
component ([A] = nA⁄V, …) can be written as follows.
v is called the reaction rate. An experimentally obtained reaction rate usually depends on the
reactant concentration, which is expressed by the following equation using a reaction rate
constant k.
$$v = k[\mathrm{A}]^{\alpha}[\mathrm{B}]^{\beta}\cdots$$
This type of equation is called a “reaction rate equation”, and the sum of the exponents
n = α + β + … is called the “overall reaction order”. The exponent of each molecular species is
called the order of reaction for that molecular species. In this case, the reaction is said to be
αth order in A and βth order in B.
This differential equation can be solved easily to provide the following integrated rate equation,
with the initial concentration of A being [A]₀.
$$[\mathrm{A}] = [\mathrm{A}]_0\, e^{-kt}$$
Thus, in the first-order reaction, the reactant concentration decreases exponentially with time
from the initial concentration. This equation can be rewritten as follows.
$$\ln[\mathrm{A}] = \ln[\mathrm{A}]_0 - kt$$
This means that if a chemical reaction has been observed experimentally or otherwise to be a
first-order reaction, the logarithm of the concentration will make a linear graph when plotted
against time, and we can obtain the reaction rate constant from the slope.
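Reading the rate constant off the slope can be done with an ordinary least-squares line through the points (t, ln[A]). The sketch below uses synthetic data generated from assumed values (k = 0.05 s⁻¹, [A]₀ = 2.0 in arbitrary units); the function name fit_first_order is chosen here for illustration.

```python
import math


def fit_first_order(times, concentrations):
    """Least-squares line through (t, ln[A]); returns (k, [A]0)."""
    ys = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)  # slope = -k, intercept = ln[A]0


# Synthetic, noise-free data from the assumed values k = 0.05 and [A]0 = 2.0.
times = [0, 10, 20, 30, 40]
data = [2.0 * math.exp(-0.05 * t) for t in times]
k_fit, a0_fit = fit_first_order(times, data)
print(k_fit, a0_fit)
```

With real measurements the points scatter around the line, and a clearly curved plot of ln[A] against time is a sign that the reaction is not first order.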
Below is an example of a second-order reaction. When there is only one reactant, the reaction
rate equation is
or
When it includes two reactants, the reaction rate equation can be written as follows.
When the two reactants have the same concentration ([A] = [B]), the equation is the same as
the equation for a single reactant. When they have different concentrations, an integrated rate
equation can be written as follows.
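For orientation, the standard textbook forms of these second-order rate laws are sketched below. They assume the convention that the rate is written as −d[A]/dt and that A and B react one-to-one, which may differ from the convention used in the omitted equations by a stoichiometric factor.

$$-\frac{d[\mathrm{A}]}{dt} = k[\mathrm{A}]^2 \quad\Longrightarrow\quad \frac{1}{[\mathrm{A}]} = \frac{1}{[\mathrm{A}]_0} + kt$$

$$-\frac{d[\mathrm{A}]}{dt} = k[\mathrm{A}][\mathrm{B}] \quad\Longrightarrow\quad \frac{1}{[\mathrm{B}]_0 - [\mathrm{A}]_0}\,\ln\frac{[\mathrm{A}]_0[\mathrm{B}]}{[\mathrm{B}]_0[\mathrm{A}]} = kt \qquad ([\mathrm{A}]_0 \ne [\mathrm{B}]_0)$$

The second result follows because [B] − [A] stays constant during the reaction, so the rate law can be integrated by partial fractions.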
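This reversible reaction is composed of the following two elementary reactions (forward (->) and
reverse (<-)).
$$\mathrm{A} + \mathrm{B} \xrightarrow{k_1} \mathrm{Y} + \mathrm{Z} \tag{1}$$
$$\mathrm{Y} + \mathrm{Z} \xrightarrow{k_{-1}} \mathrm{A} + \mathrm{B} \tag{2}$$
The molecule A is consumed in the reaction (1) at the rate of k₁[A][B] and produced in the
reaction (2) at the rate of k₋₁[Y][Z]. Thus, the apparent rate equation of the entire reaction can
be written as follows.
$$\frac{d[\mathrm{A}]}{dt} = -k_1[\mathrm{A}][\mathrm{B}] + k_{-1}[\mathrm{Y}][\mathrm{Z}]$$
This can be solved either analytically or numerically to give the concentration as a function of
time.
At equilibrium, where the reaction rates of the forward and reverse reactions are equal, the
following equation holds.
$$k_1[\mathrm{A}]_{eq}[\mathrm{B}]_{eq} = k_{-1}[\mathrm{Y}]_{eq}[\mathrm{Z}]_{eq}$$
This means that the concentration equilibrium constant Kc and the reaction rate constants have
the following relation.
$$K_c = \frac{[\mathrm{Y}]_{eq}[\mathrm{Z}]_{eq}}{[\mathrm{A}]_{eq}[\mathrm{B}]_{eq}} = \frac{k_1}{k_{-1}}$$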
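The link between the rate constants and the equilibrium constant can be checked by integrating the rate equation numerically. The sketch below uses a simple Euler scheme for A + B ⇌ Y + Z with mass-action kinetics; the rate constants, initial concentrations, and step size are illustrative values in arbitrary units.

```python
def integrate_reversible(a0, b0, y0, z0, k1, k_minus1, dt, steps):
    """Euler integration of A + B <-> Y + Z with mass-action kinetics."""
    a, b, y, z = a0, b0, y0, z0
    for _ in range(steps):
        net = k1 * a * b - k_minus1 * y * z  # net forward rate
        a -= net * dt
        b -= net * dt
        y += net * dt
        z += net * dt
    return a, b, y, z


k1, k_minus1 = 2.0, 0.5  # illustrative rate constants
a, b, y, z = integrate_reversible(1.0, 1.0, 0.0, 0.0, k1, k_minus1, 1e-3, 50_000)
ratio = (y * z) / (a * b)
print(ratio, k1 / k_minus1)
```

After the concentrations stop changing, the ratio [Y][Z]/([A][B]) agrees with k₁/k₋₁, independent of the starting concentrations.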
Therefore, you can experimentally observe the temporal change in fluorescence intensity (change
in the concentration of QL(overlined)) by fluorometry (see section 81) and fit it to the integrated
rate equation of a second-order reaction
A schematic of the reaction and a graph of the corresponding fluorescence intensity [1]. Reprinted
with permission from PRL, 2003.
—Fumi Takabatake (Tohoku University)
Double Helix
The double helix structure of DNA was reported by Watson and Crick in 1953 (Fig. 1)[1]. The
diameter of the double helix is about 2 nm, and its length is about 3.4 nm per turn. The double
helix contains two kinds of grooves that differ in width, with the wider one referred to
as the major groove and the narrower one as the minor groove. When two DNA molecules wind
around each other to form a double helix, we call this double-stranded DNA or a DNA double
strand. When they are separated from each other, we call each of them single-stranded DNA or
a DNA single strand. A double helix usually exists as a “B form” structure, and as “A form” and
“Z form” structures only under particular conditions. These forms have different geometries and
dimensions (see the table below).
Bases
DNA is composed of four distinct bases, i.e., adenine (A), guanine (G), cytosine (C), and thymine
(T) (Fig. 2). Each base, together with a sugar and a phosphate, forms a “nucleotide” unit (Fig. 2).
A compound made of a base bound to the 1’ carbon atom in the sugar is called a nucleoside, and
an ester compound made of a phosphate bound to the 5’ position of the nucleoside is called a
nucleotide [2]. In a single strand DNA, its terminus at a 5’ carbon atom is called a 5’ end, and that
at a 3’ carbon atom is called a 3’ end. The strand is written in the 5’ to 3’ direction (see Fig. 2).
Two DNA strands in a double strand DNA run in opposite directions (antiparallel).
Column: RNA
RNA (ribonucleic acid) is a macromolecule that is very similar to DNA. There are two major
differences between them. First, the nucleotide sugar in RNA is a ribose with a hydroxyl group
(-OH) present on the 2’ position. Second, RNA uses a base U (uracil), instead of T (thymine) in
DNA. U and T have the same structure except for one methyl group (-CH₃) that is present only
in T.
With the initial DNA concentration (concentration immediately after mixing) being [S] = [S̄]
= C and the hybridized fraction in the equilibrium state being θ, the equilibrium constant K is as
follows.
$$K = \frac{\theta}{(1-\theta)^2\,C}$$
Thermodynamically (see sections 37 and 38), with regard to free energy ∆G, enthalpy ∆H, and
entropy ∆S,
∆G = -RT lnK [3]
∆G = ∆H - T∆S [4]
holds (T: absolute temperature; R = 1.987 × 10⁻³ [kcal mol⁻¹ K⁻¹]: gas constant; ln: natural
logarithm). From this,
$$T = \frac{\Delta H}{\Delta S - R \ln K} = \frac{\Delta H}{\Delta S - R \ln\dfrac{\theta}{(1-\theta)^2 C}} \qquad [5]$$
Here, the temperature at which the ratio of hybridized DNA molecules to single-stranded DNA
molecules is 1:1 is called the melting temperature (Tm). Because θ = 0.5 and T = Tm, the equation
[5] can be transformed into
$$T_m = \frac{\Delta H}{\Delta S + R \ln(C/2)} \qquad [6]$$
If the initial concentration C and the thermodynamic parameters ∆H and ∆S are known, Tm
can be calculated. Tm is a widely used measure of stability of DNA hybridization. A higher Tm
indicates a higher stability of hybridization.
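A melting temperature can be computed directly once ∆H, ∆S, and C are fixed. The sketch below assumes the two-state relation Tm = ∆H / (∆S + R ln(C/2)), which follows from K = 2/C at half hybridization when both strands start at concentration C. The inputs ∆H = −177.2 kcal mol⁻¹ and C = 0.5 µM are illustrative assumptions; only the ∆S value is quoted from the worked example in this section.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def melting_temperature(delta_h, delta_s, conc):
    """Tm in kelvin for two different strands, each at initial concentration conc (M).

    delta_h : enthalpy change in kcal/mol
    delta_s : entropy change in kcal/(mol K)
    """
    return delta_h / (delta_s + R * math.log(conc / 2.0))


tm_kelvin = melting_temperature(-177.2, -0.4827, 0.5e-6)  # assumed ΔH and C
print(tm_kelvin, tm_kelvin - 273.15)
```

Raising the strand concentration raises Tm slightly, because the concentration enters only through the logarithmic term.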
hybridization, but still can be handled similarly. Imagine a single strand DNA S and its hairpin
structure state H. In the two-state model,
$$\mathrm{S} \rightleftharpoons \mathrm{H}$$
With the initial DNA concentration being [S] = C and the fraction of the hairpin form in an
equilibrium state being θ,
$$K = \frac{[\mathrm{H}]}{[\mathrm{S}]} = \frac{\theta}{1-\theta}$$
Similarly,
∆H and ∆S for each pair (Table 1) are summed up to obtain the ∆H and ∆S of the entire DNA
hybridized. The effect of strand termini can be calculated using GC-term and AT-term in Table 1.
Thus,
Similarly,
∆S = -0.4827 [kcal mol⁻¹ K⁻¹]
When the initial concentration C = 0.5 µM,
Tm = 345.48 [K] = 72.3 [°C]
from the equation [6]. Note that this is a value in 1 M Na⁺. The salt concentration dependence
of Tm is due to the salt concentration dependence of ∆S. Empirically,
∆S([Na⁺]) = ∆S(1 M) + 0.368 N ln([Na⁺]) [10]
where [Na⁺] is in M, the correction term is in cal mol⁻¹ K⁻¹, and N is the number of stacks
(base length − 1) [2]. For example, in 0.1 M Na⁺,
Tm = 332.80 [K] = 59.6 [°C]
because
∆S(0.1 M) = ∆S(1 M) + 0.368 × (23 − 1) × ln(0.1)
= -0.50225 [kcal mol⁻¹ K⁻¹]
That is, a lower salt concentration results in a lower Tm.
Last, we discuss a structure with an unpaired base (mismatch) such as a hairpin loop and a
bulge loop. In this case, the NN method is not enough; we need a parameter to account for the
mismatch. For example, for a loop made of n bases, the following holds (Jacobson-Stockmayer
equation).
∆H = 0
Fig. 1. Hybridized DNA. Each vertical line indicates base pair formation.
Table 1. Thermodynamic parameters (1 M Na+)[2].
Sequence               ∆H (kcal·mol⁻¹)   ∆S (cal·mol⁻¹·K⁻¹)
GC/CG                   -9.8              -24.4
CG/GC                  -10.6              -27.2
GG/CC, CC/GG            -8.0              -19.9
CA/GT, TG/AC            -8.5              -22.7
GT/CA, AC/TG            -8.4              -22.4
GA/CT, TC/AG            -8.2              -22.2
CT/GA, AG/TC            -7.8              -21.0
AA/TT, TT/AA            -7.9              -22.2
AT/TA                   -7.2              -20.4
TA/AT                   -7.2              -21.3
GC-term                  0.1               -2.8
AT-term                  2.3                4.1
Symmetry correction      0                 -1.4
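The nearest-neighbor summation can be sketched in a few lines of Python using the values of Table 1. Two points are assumptions made here for illustration: one terminal term (GC-term or AT-term) is added for each end of the duplex according to its end base pair, and the symmetry correction is added only when the sequence equals its own reverse complement. The function names are hypothetical.

```python
# (ΔH in kcal/mol, ΔS in cal/(mol K)) per stack, keyed by the 5'->3' dinucleotide
# of one strand; each table row covers a dinucleotide and its reverse complement.
NN = {
    "GC": (-9.8, -24.4), "CG": (-10.6, -27.2),
    "GG": (-8.0, -19.9), "CC": (-8.0, -19.9),
    "CA": (-8.5, -22.7), "TG": (-8.5, -22.7),
    "GT": (-8.4, -22.4), "AC": (-8.4, -22.4),
    "GA": (-8.2, -22.2), "TC": (-8.2, -22.2),
    "CT": (-7.8, -21.0), "AG": (-7.8, -21.0),
    "AA": (-7.9, -22.2), "TT": (-7.9, -22.2),
    "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
}
TERMINAL = {"G": (0.1, -2.8), "C": (0.1, -2.8), "A": (2.3, 4.1), "T": (2.3, 4.1)}
SYMMETRY = (0.0, -1.4)
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_dh_ds(seq):
    """Sum ΔH (kcal/mol) and ΔS (cal/(mol K)) for seq paired with its complement."""
    seq = seq.upper()
    terms = [NN[seq[i:i + 2]] for i in range(len(seq) - 1)]  # one term per stack
    terms += [TERMINAL[seq[0]], TERMINAL[seq[-1]]]           # assumed: one per end
    if seq == reverse_complement(seq):
        terms.append(SYMMETRY)                               # self-complementary
    return sum(t[0] for t in terms), sum(t[1] for t in terms)


print(duplex_dh_ds("AAC"))
```

Note that ∆S comes out in cal mol⁻¹ K⁻¹ and must be divided by 1000 before it is combined with ∆H in kcal mol⁻¹, for example when computing Tm.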
a system capable of dynamic changes [1] (see section 23); this technique is based on the DNA
strand displacement reaction (DSD) shown in Fig. 1. It is also called a “toehold-mediated strand
displacement”.
In the initial state, the green and red DNA strands form a double helix. When the blue single
strand DNA is inputted, it forms a double helix with the red DNA, releasing the green DNA as a
single strand. The number of hydrogen bonds between complementary bases is greater after
the reaction, indicating that the post-reaction state is more favorable in terms of free energy
(see section 38).
Fig. 2. Intermediate states in the strand displacement reaction with the branching point
wandering to the left and the right, which is called branch migration.
be expressed by a chemical reaction equation. The reaction rate constant (k) can be changed by
manipulating the length (n) of the toehold sequence a. According to [2], when the sequence b
to be displaced is 20 base long in 11.5 mM MgCl₂ at 25˚C, the following equation can estimate
an approximate value of k.
As the equation shows, when the toehold sequence is lengthened by 1, 2, …, 6 bases,
the value of the rate constant increases exponentially, by 10, 100, …, 1,000,000-fold. Above 6
bases, it reaches a plateau. For example, when the initial concentration of each of the two
molecules on the left-hand side is 10 nM, the time needed to consume 50% of the starting
material can be adjusted by orders of magnitude, e.g. 3 minutes for 6 bases, 30 minutes for 5
bases, 5 hours for 4 bases, and so on. This property is important when applying the strand
displacement reaction to more complex systems (see section 28).
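The half-times quoted above can be related to the rate constant with a short calculation. This is a minimal sketch, assuming a simple second-order reaction A + B -> products with equal initial concentrations, for which the time to consume 50% of the starting material is 1/(k·C₀). The function names are hypothetical, and the values of k are back-calculated from the quoted half-times rather than taken from the equation in [2].

```python
def half_time(k, c0):
    """Half-time in seconds for A + B -> products with [A]0 = [B]0 = c0.

    k is the second-order rate constant in 1/(M*s); c0 is in M.
    For equal initial concentrations, 1/[A] - 1/c0 = k*t, so t_half = 1/(k*c0).
    """
    return 1.0 / (k * c0)


def rate_constant_from_half_time(t_half, c0):
    """Invert the relation above: k = 1/(t_half * c0)."""
    return 1.0 / (t_half * c0)


if __name__ == "__main__":
    c0 = 10e-9  # 10 nM, as in the text
    # Half-times quoted in the text, converted to seconds, keyed by toehold length.
    quoted = {6: 3 * 60, 5: 30 * 60, 4: 5 * 3600}
    for toehold_length, t_half in sorted(quoted.items(), reverse=True):
        k = rate_constant_from_half_time(t_half, c0)
        print(f"toehold {toehold_length} nt: k is about {k:.2e} 1/(M*s)")
```

Under this assumption, each base removed from the toehold lowers k by a factor of ten, which is why the half-time grows from minutes to hours.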
For more detailed analysis, the reaction can be divided into three steps: hybridization (two-
molecule reaction), denaturation (single-molecule reaction), and branch migration [2]. For
computer-aided analysis and design of the reaction, an online simulator is available (see section
67). To experimentally determine the exact rate constant, the reaction can be subjected to
fluorometric time-lapse observation in a test tube (see sections 51 and 81).
-Dangling end: a single-stranded region at the end of a DNA strand, with no hydrogen bond.
-Hairpin loop: a hairpin-like loop structure. Includes only one base pair.
-Stacking pair: consecutive base pairs.
-Bulge loop: a structure that contains a loop in only one of the two DNA strands. Includes two
base pairs.
-Internal loop: a structure that contains a loop in each of the two DNA strands. Includes two
base pairs.
-Multibranch loop: a loop from which three or more helices exit. Includes three or more base
pairs.
Although not depicted in the figure, DNA can also form a “pseudoknot” structure. For details of
these structures, see [1] and other texts.
One is to predict the structure of a given base sequence in a solution and its stability, which
is called “secondary structure prediction” of DNA (see section 63). Another is to find a DNA
sequence that will form a desired structure, which is called “sequence design” of DNA (see
section 64). Both techniques are basic to and essential for controlling chemical reactions of DNA.
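As a small illustration of the sequence design side, the sketch below checks whether a candidate sequence is at least compatible with a target secondary structure. It assumes the target is written in dot-bracket notation, a common convention that is not introduced in this text: matching parentheses mark paired bases and dots mark unpaired bases. The function names are hypothetical. The check only confirms that every intended pair is a Watson-Crick pair; it does not predict which structure the sequence will actually adopt.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def base_pairs(structure):
    """Return sorted (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def is_compatible(seq, structure):
    """True if every paired position in structure is a Watson-Crick pair in seq."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper()
    return all((seq[i], seq[j]) in WATSON_CRICK for i, j in base_pairs(structure))


if __name__ == "__main__":
    # A hairpin with a 4-base-pair stem and a 4-base loop (illustrative sequence).
    print(is_compatible("GCGCAAAAGCGC", "((((....))))"))
```

A real design tool would also score competing structures, which is the job of the secondary structure prediction described above.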
assembly, you will only have to consider its secondary structure. For most proteins, however,
you will have to take their tertiary and/or higher structures into account.
By analogy with parts of a robot, you can understand why most people use DNA to make
programmable biological parts, instead of proteins.
Parts for a robot or a similar system must be designed to work exactly as expected, but the
structure of a protein is very difficult to predict, which seriously hinders protein design.
In the future, however, a day might come when we can overcome this technological limitation,
design any desired protein, and accurately predict and control its behavior. If so, it will
lead to extremely useful biological robots, because proteins have diverse functions that are
not possible with DNA.
Phosphoramidite Method
The most common method to chemically synthesize DNA is the solid-phase phosphoramidite
method. In this method, the starting material is the first nucleoside of the desired DNA, linked
to a solid-phase carrier (a controlled pore glass column) through an ester bond. Nucleic acid
monomers are then coupled to its 5’ end, one base at a time. These “phosphoramidite” monomers,
named after a phosphate diester precursor moiety contained in their chemical structure, have
protecting groups introduced on the base moiety, the 5’ hydroxyl group of the deoxyribose, and
the phosphoramidite moiety.
The phosphoramidite method is composed of a total of six reaction steps.
Chapter 4 Basic Textbook Knowledge 158
Michaelis-Menten Equation
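The reactants A and B are called substrates (S) of the enzyme (E). With the product C written
as P, the reaction can be written as follows.
$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{+1}}{\rightleftharpoons}} \mathrm{ES} \overset{k_{+2}}{\longrightarrow} \mathrm{E} + \mathrm{P}$$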
If the first step (E + S <-> ES) in the reaction readily reaches chemical equilibrium, the second
step (ES -> E + P) is the rate-limiting step in the reaction. When the dissociation constant Ks =
[E][S]/[ES] (equation 1) and the total enzyme concentration [E]₀ = [E]+[ES] (equation 2), equation
2 can be substituted into equation 1 to provide the following equation.
Chapter 5 Techniques/Materials to Boost DNA 163
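$$[\mathrm{ES}] = \frac{[\mathrm{E}]_0\,[\mathrm{S}]}{K_s + [\mathrm{S}]} \tag{3}$$
$$v = k_{+2}\,[\mathrm{ES}] \tag{4}$$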
Equation 3 can be substituted into equation 4, with Vmax = k+₂[E]₀, to provide the well-known
Michaelis-Menten equation, as follows.
$$v = \frac{V_{max}\,[\mathrm{S}]}{K_s + [\mathrm{S}]} \tag{5}$$
Equation 5 holds true when the second step (ES -> E+P) is the rate-limiting step (i.e., k-₁ >> k+₂).
You can derive a similar equation by assuming a steady state instead. In that case, Ks in
equation 5 is replaced by Km as follows.
$$v = \frac{V_{max}\,[\mathrm{S}]}{K_m + [\mathrm{S}]}, \qquad K_m = \frac{k_{-1} + k_{+2}}{k_{+1}} \tag{6}$$
If k-₁ >> k+₂ in equation 6, then k+₂ can be ignored, so that Km ≈ Ks = k-₁/k+₁.
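The relation between Ks, Km, and the reaction rate can be checked numerically. The sketch below assumes the standard forms v = Vmax[S]/(Km + [S]), Km = (k-₁ + k+₂)/k+₁, and Ks = k-₁/k+₁ for the two-step scheme E + S <-> ES -> E + P. The function names and the rate constants in the example are hypothetical and chosen only for illustration.

```python
def km_from_rate_constants(k_plus1, k_minus1, k_plus2):
    """Steady-state Michaelis constant Km = (k-1 + k+2) / k+1."""
    return (k_minus1 + k_plus2) / k_plus1


def ks_from_rate_constants(k_plus1, k_minus1):
    """Dissociation constant of the ES complex, Ks = k-1 / k+1."""
    return k_minus1 / k_plus1


def michaelis_menten_rate(s, vmax, km):
    """Reaction rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


if __name__ == "__main__":
    # Illustrative constants with k-1 >> k+2, so that Km is close to Ks.
    k_plus1, k_minus1, k_plus2 = 1.0e6, 1.0e3, 1.0
    km = km_from_rate_constants(k_plus1, k_minus1, k_plus2)
    ks = ks_from_rate_constants(k_plus1, k_minus1)
    print("Km =", km, "Ks =", ks)
    # At [S] = Km the rate is half of Vmax.
    print(michaelis_menten_rate(km, vmax=1.0, km=km))
```

When k+₂ is made comparable to k-₁ in this sketch, Km moves away from Ks, which is the case where the steady-state form is needed.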
The energy change ∆G⁰ in the first step (E+S <-> ES) takes a negative value, and is related to
Km (see above). After that, the energy change ∆G‡ in ES -> ES‡ takes a positive value, and is
related to k+₂ (see above). An enzyme can stabilize the transition state ES‡ (i.e., lower its
energy) to promote the reaction. As a result, the sum of the two energy changes
(∆GT‡ = ∆G⁰ + ∆G‡) is important for the overall reaction rate, and its value can vary greatly
among different enzymatic reactions.
Polymerases
This family of enzymes synthesizes a nucleic acid macromolecule, such as DNA or RNA, using
DNA or RNA as a template.
DNA polymerase, which synthesizes DNA, can add a nucleotide to the OH group (hydroxyl group)
at the 3’ end of a DNA strand (i.e., the nascent chain extends in the 5’ to 3’ direction). It requires
a “primer” sequence to initiate synthesis.
RNA polymerases, which synthesize RNA, include those that use DNA as a template, those
that use RNA as a template, and those that do not require any template (e.g., poly(A)
polymerase, which adds a poly(A) sequence to the 3’ end of mRNA). Some RNA polymerases can
synthesize an artificial nucleic acid, and some can even synthesize a natural nucleic acid using
an artificial nucleic acid as a template.
Restriction Enzymes
A restriction enzyme recognizes a specific sequence and cleaves the DNA at or near it. Each
cleaved fragment ends up with either a blunt end, in which both strands are cut at the same
position, or a sticky end, in which one strand overhangs. The most popular ones are the
so-called “type II” restriction enzymes, each of which recognizes a palindromic sequence (i.e.,
a sequence that reads the same on one strand in the 5’-3’ direction and on the complementary
strand in the 5’-3’ direction).
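The palindrome property can be tested directly: a site is palindromic in this sense when it equals its own reverse complement. The following is a minimal sketch with hypothetical function names. GAATTC, the recognition site of EcoRI, is used as a well-known example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(seq):
    """True if seq reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for site in ("GAATTC", "GGATCC", "GATTAC"):
        print(site, is_palindromic_site(site))
```

The same check is useful in sequence design, because an unintended palindromic stretch lets a strand pair with another copy of itself.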
Nucleases
This family of enzymes hydrolyzes the phosphodiester bond between the sugar and the
phosphate. An endonuclease cleaves a nucleic acid strand in the middle (endo), whereas an
exonuclease cleaves it from an end (exo).
Ligases
A ligase connects the 5’ end of a nucleic acid molecule to the 3’ end of the same or another
molecule through a phosphodiester bond. Together, nucleases and ligases let you cut and sew
nucleic acid strands.
Hisashi Tadakuma (Kyoto University)
DNA Polymerase
Enzymatic reactions are important when building with nucleic acid building blocks because they
can directly manipulate and process the DNA/RNA strands of interest. There are many examples
of enzymatic reactions being used to realize a desired design, e.g. in the implementation of a
DNA computing reaction or the preparation of a DNA hydrogel. Polymerase enzymes are in charge
of “copying”, which is particularly important for bioengineering. The structures and mechanisms
of polymerases have been studied extensively from a purely scientific viewpoint, but are not
discussed in this section.
By definition, DNA polymerase is an enzyme that extends (polymerizes) DNA. Along a template
DNA, it synthesizes the complementary DNA sequence in the 5’-3’ direction, starting from
a primer DNA that has been hybridized to part of the template. During this process, one
dNTP (deoxynucleoside triphosphate) from the solution, i.e. dATP, dTTP, dCTP, or dGTP, is
consumed as a substrate for each base added.
Column: PCR
PCR is among the most popular DNA polymerase-based reactions (see the figure below); it can
amplify any desired portion of a double-stranded DNA. First, the double-stranded DNA is heated
in an aqueous solution to separate it into single strands. The solution already contains a
large excess of primers that specify the portion to be amplified (green and red arrows in
the figure), and these hybridize to the single strands during annealing. In this state, DNA
polymerase starts extension to synthesize mutually complementary DNA strands (light green
and light red). The solution is then heated again to return the DNA to the single-stranded
state and annealed again, which allows the primers to hybridize to the newly synthesized DNA
and initiate another extension reaction. By repeating this cycle, a double-stranded DNA
composed solely of the desired portion is replicated exponentially, as indicated by yellow in
the figure. Thanks to thermostable DNA polymerase, you no longer have to add enzyme after each
cycle.
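The exponential growth described above can be put into numbers with a simple model. This is a minimal sketch, assuming that every cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. The efficiency parameter and the function names are assumptions introduced for illustration; real reactions fall below the ideal and eventually plateau.

```python
import math


def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles, assuming growth by (1 + efficiency) per cycle."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(fold_amplification, efficiency=1.0):
    """Smallest whole number of cycles giving at least the requested fold amplification."""
    return math.ceil(math.log(fold_amplification) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    print(pcr_copies(1, 30))            # ideal doubling for 30 cycles
    print(pcr_copies(1, 30, 0.9))       # the same run at 90% efficiency
    print(cycles_to_reach(1e6))         # cycles needed for a million-fold increase
```

Comparing the first two printed values shows how strongly a small loss of per-cycle efficiency reduces the final yield.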
RNA Polymerase
RNA polymerase is an important enzyme that serves as a central component of transcription
machinery in vivo. In particular, DNA-dependent RNA polymerase uses a DNA template to
synthesize its complementary RNA strand.
One representative RNA polymerase is T7 RNA polymerase. As shown in the figure below, T7 RNA
polymerase uses NTPs (nucleoside triphosphates) as substrates to synthesize RNA from a
double-stranded DNA template, starting at a specific “initiation sequence” called the T7
promoter sequence and proceeding in the 3’ direction. This enzyme is commonly employed in the
field of DNA/RNA nanotechnology and DNA computing, e.g. in reaction systems such as RNA tiles
[3, 4] and RTRACS [4].
When you use a restriction enzyme, you must pay attention to the salt concentration. The
activity of a typical restriction enzyme depends on monovalent ions (K+, Na+, Cl-, OAc-, etc.),
divalent ions (Mg²+ etc.), and/or pH. From the optimized buffer solutions available from enzyme
manufacturers, you can choose one appropriate for the enzyme to be used. When you have to
use more than one enzyme in combination, you can refer to buffer compatibility charts and apps
provided by the same manufacturers. Recombinantly improved enzymes that can function in a
single buffer solution have also been developed.
Even when following the manufacturers’ instructions, you need to monitor the enzymes carefully.
Each enzyme is shipped in a 50% glycerol solution for storage purposes, and too much glycerol
in the reaction impairs the recognition specificity of the enzyme, inducing off-target cleavage
(“star activity”). There are commercially available recombinant enzymes that have less star
activity.
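Because the enzyme stock is the main source of glycerol, the final glycerol concentration in a reaction follows from simple dilution. The sketch below assumes a 50% glycerol stock, as stated above. The 5% limit used as the default is an assumption based on a commonly quoted guideline, not a value from this text, so check the documentation of your enzyme. The function names are hypothetical.

```python
def final_glycerol_percent(enzyme_volume_ul, total_volume_ul, stock_glycerol_percent=50.0):
    """Final glycerol concentration (% v/v) after diluting the enzyme stock."""
    if not 0 < enzyme_volume_ul <= total_volume_ul:
        raise ValueError("enzyme volume must be positive and no larger than the total volume")
    return stock_glycerol_percent * enzyme_volume_ul / total_volume_ul


def exceeds_glycerol_limit(enzyme_volume_ul, total_volume_ul, limit_percent=5.0):
    """True if the reaction contains more glycerol than the assumed limit."""
    return final_glycerol_percent(enzyme_volume_ul, total_volume_ul) > limit_percent


if __name__ == "__main__":
    # 1 uL of enzyme in a 20 uL reaction, then 5 uL of enzymes in the same volume.
    print(final_glycerol_percent(1, 20), exceeds_glycerol_limit(1, 20))
    print(final_glycerol_percent(5, 20), exceeds_glycerol_limit(5, 20))
```

When several enzymes are combined in one tube, pass their total volume as the enzyme volume, since each contributes glycerol.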
Azobenzene Photoswitch [1]
This artificial base, developed by Hiroyuki Asanuma (Nagoya University) and Xingguo Liang
(Ocean University of China) et al., can photo-switch DNA double helix formation/dissociation.
When this azobenzene residue absorbs UV light with a long wavelength (300 to 400 nm), it can
photo-isomerize from the trans form (top) to the cis form (bottom). Upon absorption of visible
light of >400 nm, it can photo-isomerize from the cis form to the trans form. The standard trans
form has a planar structure, and thus stabilizes the DNA double helix when inserted between
base pairs in DNA. The cis form has a slightly twisted, bulky structure due to steric repulsion
between the benzene rings, and thus greatly destabilizes the DNA double helix. This difference
allows the selective dissociation of a DNA double helix controlled by UV irradiation and its
selective formation by visible light irradiation. Because the isomerization efficiency is below
100%, it may not be effective to insert only one moiety. Amazingly, multiple moieties, with one
inserted after every two bases, greatly stabilize the double helix in the trans form, and greatly
destabilize it when isomerized to the cis form. If financially feasible, you should include multiple
moieties.
When it is incorporated into one of the strands, it can be covalently bonded to T in the comple-
mentary strand upon one second of 366 nm UV irradiation. In order to separate them, you only
have to expose them to 312 nm light. To make the connection more efficient, the base T in the
complementary strand should be in the position facing CNVK or in a position next to that.
Photocleavable PC Linker [3]
Another photo-controllable artificial base is the photocleavable (PC) linker, which allows you to
cleave DNA by means of light.
When this linker is inserted in the backbone of DNA, you can readily cleave the backbone by
just exposing it to long wavelength UV (> 300 nm) from a transilluminator or portable UV lamp.
After the reaction, the ends of both fragments, i.e. the fragments 3’ and 5’ to the PC linker, have
a phosphate monoester. Accordingly, you cannot ligate the 5’ fragment to any other fragment
unless you dephosphorylate it.
Universal Bases dK and dP [4]
These artificial bases have been developed to take advantage of the tautomerism of nucleic acid
bases to allow base pairing with more than one type of base.
dK is an analog of A and G, and can form a stable base pair with either T or C. Likewise, dP
is capable of base pairing with either A or G.
Artificial Base Pair isoG-isoC [5]
The world’s first artificial base pair, this was developed by Benner et al. (formerly of the
University of Florida) as a third base pair to add to the A-T and G-C pairs.
Compared to naturally occurring G and C, the positions of the amino and carbonyl substituents
are all inverted. They are not recognized well by enzymes such as polymerases, but will suffice
to serve as a third base pair in ordinary DNA computing systems.
Artificial Base Pair Ds-Pa [5]
Below is another third base pair, developed by Ichiro Hirao et al. (RIKEN).
Surprisingly, this pair relies only on shape complementarity, not on hydrogen bonding, for base
pairing, and can still be recognized by enzymes and correctly replicated.
Chemical structure of biotin. It can be conjugated to a different molecule through the
carboxylate side chain.
Streptavidin has many merits in DNA nanotechnology. For example, it is free of nonspecific
adsorption onto mica, and can be observed as a bright 5 nm dot, which is taller than DNA. Our
laboratory is so fond of this protein that we have a saying: “Call streptavidin if you need help”.
Streptavidin (bright dot) caught by closed DNA pliers.
pairing. In “Hoogsteen” base pairs, the 2 and 7 positions participate in base pairing, wherein T
binds to an A-T pair and protonated C+ binds to a G-C pair.
DNA Triplex
As seen from the figure above, only pyrimidines are capable of Hoogsteen bonding with normal
Watson-Crick (purine-pyrimidine) base pairs. In line with this, naturally occurring DNA
triplexes have been discovered. Each is formed by Hoogsteen bonding between a
homopurine-homopyrimidine region in a double helix (a sequence of consecutive purines or
pyrimidines in one strand, such as AAAAAAAA/TTTTTTTT) and a third, homopyrimidine strand
(TTTTTTTT). These triplexes have been implicated in the regulation of gene expression.
Of the two kinds of triplets (triplet = a combination of three bases), the T-A-T triplet can form
under any conditions. The C+-G-C triplet can only form under acidic conditions, because it
requires protonated C+.
In general, the binding of the third strand is less stable than the formation of the duplex. In the
UV melting curve of such a triplex (see section 82), two-step melting is observed.
Because a DNA triplex is not very stable, you may not want to use it in your BIOMOD project.
In any case, a sequence that includes consecutive purines or pyrimidines is not recommended in
DNA sequence design.
G quadruplex
Naturally occurring DNA quadruplexes [2], composed of even more strands, have also been
discovered. The most famous one is the guanine quadruplex.
This is derived from a G quartet structure that is formed by Hoogsteen bonding among four
guanine bases surrounding a coordinated metal ion. The metal ion trapped in the center is
usually Na+ or K+ (K+ provides a stronger bond).
In general, a sequence with three or more consecutive G bases tends to form this structure.
In vivo, the telomere region at each end of a chromosome is especially rich in G-repeat
sequences (in the human genome, TTAGGG is repeated over and over). It has been
proposed that quadruplex formation in this terminal single-stranded region
may contribute to the stability of the chromosome itself.
In principle, a typical guanine quadruplex is more stable than a DNA double helix. In sequence
design, you should avoid three or more consecutive G bases. If a single strand has too many
G-repeat sequences, it might be folded in a complex manner into a stable intramolecular
quadruplex. For example, a strand with four repeats of a sequence, (TTAGGG)₄, is well known
to be folded in diverse fashions to form an intramolecular quadruplex.
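The design rules in this section can be turned into a quick screen for candidate sequences. The following is a minimal sketch with hypothetical function names. The threshold of three consecutive G bases comes from the text; the default of eight for homopurine and homopyrimidine runs is an assumption that simply mirrors the AAAAAAAA example.

```python
import re


def find_runs(seq, bases, min_len):
    """Return (start, end) spans of runs made only of the given bases."""
    pattern = re.compile("[%s]{%d,}" % (bases, min_len))
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def find_g_runs(seq, min_len=3):
    """Runs of consecutive G that may favor G-quadruplex formation."""
    return find_runs(seq, "G", min_len)


def find_homopurine_runs(seq, min_len=8):
    """Runs of consecutive purines (A or G)."""
    return find_runs(seq, "AG", min_len)


def find_homopyrimidine_runs(seq, min_len=8):
    """Runs of consecutive pyrimidines (C or T)."""
    return find_runs(seq, "CT", min_len)


if __name__ == "__main__":
    telomere_like = "TTAGGG" * 4
    print(find_g_runs(telomere_like))
    print(find_homopurine_runs("CCAAAAAAAACC"))
```

Spans use Python slice indices, so `seq[start:end]` returns the offending run. A run of C bases in a strand corresponds to a G run in its complement, so screening both strands of a duplex is worthwhile.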
Conversely, you can make use of a guanine quadruplex to make a DNA-based molecular machine
that can be triggered by Na+ and/or K+. For example, it is possible to make a single-molecule
metal ion sensor from DNA pliers that close upon detecting Na+ and/or K+, by introducing
several (TTAGGG)₂ sequences into the levers of the pliers (see section 26).
i-Motif
The so-called i-motif [3] structure is another DNA quadruplex. This structure does not involve
Hoogsteen bonding, but is derived from mismatched base pairing between C and protonated C+.
C-C+ base pairing alone results only in a double strand. The i-motif is a unique structure
in which two such double strands, each formed by C-C+ base pairing, align with each other
in a head-to-tail orientation, with their base pairs mutually intercalated.
Like the C+-G-C triplet, this requires the protonation of C, which means that an i-motif can be
formed only under acidic conditions. Accordingly, several i-motif-based, pH-responsive DNA
molecular machines have been reported.
—Akinori Kuzuya (Kansai University)
(A) The use of a biotin-streptavidin interaction to observe the displacement of a myosin head
moving along an actin filament. (B) Step length observed by the needle. An average sub-step of
5.3 nm was observed. Reprinted with permission from Nature, 1999 [2].
The “hand-over-hand” model has been proposed to explain the mechanism of kinesin moving
along a microtubule. In this model, kinesin alternately moves its two heads to “walk” like a
human (FIG. 4). In solution, each kinesin head has ADP bound to it but, upon microtubule
binding, one of the heads releases ADP and binds ATP. This promotes ADP dissociation from the
other head. In this manner, the two heads alternately repeat ADP dissociation, allowing
the stepwise movement of kinesin along a microtubule.
By attaching a fluorescent dye to one motor domain of kinesin and subjecting it to a single-
molecule measurement, researchers have demonstrated that the motor domain achieves an
average step length of about 17 nm and that kinesin moves hand-over-hand.
Hand-over-hand model
Fluorescence
Fluorescence is emitted when a molecule that has been excited from its ground state (S0) to an
excited state (S1) returns to the ground state and releases energy. This process can be
illustrated by a Jablonski diagram. The time scale of the entire phenomenon is in nanoseconds:
the transition to the excited state occurs on a scale of femtoseconds (10⁻¹⁵ s), and the
emission of fluorescence occurs on a scale of nanoseconds (10⁻⁹ s). Due to thermal vibration
etc., the energy of the fluorescence is lower than that absorbed by the fluorescent dye (i.e.
the emission shifts to a longer wavelength), which is called the Stokes shift. Because the
properties of the fluorescence (wavelength and lifetime) can be affected by the motion of the
fluorescent molecule and its interaction with its surroundings, various fluorescence
spectroscopy techniques have been developed.
Principles of FRET
A fluorescent molecule (donor) emits energy as fluorescence and returns to the ground state.
When another fluorescent molecule (acceptor) that can receive the energy is present in close
proximity, a dipole-dipole interaction occurs between the transition dipoles, and the energy is
transferred. One result is that, even when the donor is excited, the energy is transferred to the
acceptor, which (instead of the donor) then emits light. The probability of the energy transfer
depends on the relative orientation of the donor and the acceptor. The efficiency of the transfer
is inversely proportional to the sixth power of the distance between them. With commonly used
organic fluorescent dyes and fluorescent proteins such as GFP, energy transfer occurs at
distances of up to about 10 nm.
The energy transfer efficiency EFRET, which is a measure of the energy transfer, can be
expressed as
$$E_{FRET} = \frac{1}{1 + (R/R_0)^6} = \frac{I_A}{I_D + I_A} \tag{1}$$
with R being the distance between the two fluorescent dyes, ID being the fluorescence intensity
of the donor, and IA being the fluorescence intensity of the acceptor. R₀ is the distance at which
EFRET is 50% (Förster distance), calculated as follows.
equation 2
equation 3
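The distance dependence of FRET is easy to explore numerically. This is a minimal sketch, assuming the standard Förster form EFRET = 1/(1 + (R/R₀)⁶), which is consistent with R₀ being the distance at which EFRET is 50%, and the simple ratiometric estimate EFRET = IA/(ID + IA) without instrument corrections. The function names and the R₀ value in the example are hypothetical.

```python
def fret_efficiency(r, r0):
    """Transfer efficiency at donor-acceptor distance r for Forster distance r0."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def efficiency_from_intensities(i_donor, i_acceptor):
    """Ratiometric estimate of the efficiency from donor and acceptor intensities."""
    return i_acceptor / (i_donor + i_acceptor)


def distance_from_efficiency(efficiency, r0):
    """Invert the Forster relation: R = R0 * (1/E - 1)^(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 5.0  # nm, an illustrative Forster distance
    for r in (2.5, 5.0, 7.5, 10.0):
        print(r, round(fret_efficiency(r, r0), 3))
    e = efficiency_from_intensities(i_donor=1.0, i_acceptor=3.0)
    print(e, distance_from_efficiency(e, r0))
```

Because of the sixth-power dependence, the efficiency changes steeply only for distances near R₀, which is why FRET works as a ruler over a narrow range.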
Introduction
Unlike a programmable artificial life model in a computer, artificial life in the real world is
difficult to realize. Practically, all you can expect is to reconstitute parts of the various functions
of a cell. A simple set of molecules can be introduced into a microcapsule as a model of a cell,
to produce a so-called artificial cell [1]. A lipid membrane vesicle, i.e. liposome, is often used as
a carrier.
Lipid Membrane
Lipid molecules are amphiphilic; each contains both hydrophilic and hydrophobic functional
groups. When dispersed in water, they assemble into a structure that hides the hydrophobic
portions inside. Different shapes of lipid molecules can produce different structures, including
a sphere (micelle), rod (micelle), film (membrane), and plate (membrane). The field of research
studying these assemblies of molecules based on non-covalent weak interactions is called
supramolecular science (chemistry). In addition to macromolecular chemistry, you should learn
about supramolecular chemistry if you want to win BIOMOD [3]. Well-known examples of
artificial lipid membranes include black film, LB membrane, water-in-oil (w/o) emulsion, and
liposomes (FIG. 1).
Liposome
A liposome is a closed vesicle made of a lipid bilayer membrane, first reported in 1964 by
Bangham. The lipid membrane is an excellent hydrophobic barrier to enclose and separate the
inside microvolume from the outside. It has a lipid bilayer structure similar to a cell membrane,
and thus its application to the biological membrane model, cosmetics, DDS (drug delivery
system [4]), and artificial cells has been extensively studied. A liposome can be classified as
an SUV (small unilamellar vesicle), LUV (large unilamellar vesicle), MLV (multilamellar
vesicle), or GUV (giant unilamellar vesicle) depending on its size and membrane structure.
There is also growing interest in polymersomes, which are made of amphiphilic macromolecules
(polymers), because of their robustness and ease of functionalization. Since Hotani et al.
succeeded in the direct observation of GUVs by dark-field microscopy in the 1980s,
high-resolution observation techniques using fluorescently labeled molecules to look at
cell-sized liposomes have also been developed [1]. There have been a number of attempts to
install a biochemical reaction system inside a
liposome, and monitor/assess it. Thanks to advances in microprocessing technology and other
techniques, it has become much easier to prepare cell-sized liposomes of a uniform size (see
section 89).
Artificial Cell
Attempts to assemble an imitation of a cell or even a true cell from substances, together with
questions about the origin of life and xenobiology, have produced a wide range of models,
the most famous of which are those by Traube and Oparin. Artificial cells can be used to solve
the mystery of the asymmetry of a cell or how cells accomplish one-way traffic at certain
times. It is possible to operationally clarify how the substances and their systems behave by
observing an artificial, “cell-like” environment instead of a cell. For example, researchers have
already achieved a simplified protein synthesis system [5], mimicry of Darwinian evolution by
incorporating spontaneous mutation into RNA replication [6], synthetic lipid-based coupling
of membrane replication with DNA replication [7], a mycoplasma cell with a fully synthetic
genome (dubbed Synthia) [8], and protein synthesis by a liposome that contains E. coli
extract, which aims at manufacturing a whole cell [9]. In Japan, the Japanese Society for Cell
Synthesis Research was launched in 2005 [10].
Concluding Remarks
DNA, lipid membranes, liposomes, cells, and organisms are all tangible, i.e. engineerable,
entities, but “life” is just a word and a concept. What kind of system is needed to produce an
artificial cell that surpasses life? This is an interesting question to think about. It might be that
the engineering technology of mankind leaves much to be desired as long as artifacts belong to
a subset of life. Feynman asked how to put a toothed gear meshing with another into words.
Let’s try to create something hard to express in words, taking advantage of this engineering
opportunity [12].
—Shinichiro M. Nomura (Tohoku University)
Chapter 6 Software Techniques
53. Basic Software Programs: Overview
Because the “design” of a biomolecule, the subject matter of the BIOMOD competition, is
invisible to the naked eye, you must demonstrate the results of your project in a readily
understandable manner. Your scores, therefore, are based on your Wiki, YouTube video, and
presentation. Your documents, tables, graphs, and presentation slides, including animations
and videos, should be high quality and effective.
Office Suite
You can use Microsoft Office to prepare all your materials (see section 3). Microsoft Office is an
office suite (a collection of business software) provided by Microsoft; the regular suite contains
Word (document preparation), Excel (spreadsheet), and PowerPoint (presentation). Another
example of an office suite is iWork, provided by Apple Inc., which includes Pages, Numbers, and
Keynote.
The graphic tools in PowerPoint allow you not only to draw figures but also to adjust the
brightness, contrast, and tone of an image. However, using Illustrator (graphic design
software) and Photoshop (photo editor), provided by Adobe, will improve the visual elements of
your materials (the illustration below was drawn in Photoshop). While these two programs are
paid software, there are free applications that will also work, such as Inkscape and GIMP.
3D images are often necessary to make your Wiki, YouTube video, and presentation easier to
understand. There are a wide variety of 3D CGI applications, but all of them involve modeling,
material, animation, and lettering, which are described below.
ImageJ is a popular application for processing and analyzing image data from wet experiments.
Despite being free and open-source, it performs well and has advanced functions that can
satisfy even professional researchers. It supports a variety of file formats, ranging from
basic ones such as TIFF, PNG, JPEG, BMP, and AVI to uncommon formats. You should master it as a
powerful tool not only for BIOMOD but also for your graduation thesis, master’s thesis,
and other research (see section 58).
According to the judging criteria in previous BIOMOD competitions, the YouTube video accounts
for 25% of your score (see section 3). Thus, video-making skills are key. The commercial
software Premiere Pro and freeware programs such as Windows Movie Maker and AviUtl are
introduced below.
Document Preparation
When your project starts, you will discuss the schedule and roadmap (a chart that lists the
tasks necessary to achieve your goals on a timeline) with other team members. In the course of
the project, you will go through many meetings to discuss the design specification, the status
of each member’s task, and so on (see section 7). You should make full use of document
preparation software such as Word to make the group work more efficient and effective, just as
meeting documents and minutes in business settings are often prepared in Word and distributed
electronically.
Spreadsheet
Many of you have probably used Microsoft Excel, a program that creates spreadsheets. For
BIOMOD, Excel can help with data management and analysis, and can also output results as
tables and various graphs.
For data management, its “Data” menu provides Sort, Filter (data extraction), Remove
Duplicates, Subtotal, PivotTable, and other basic functions, and allows the import of external
data sources. It can also help you manage your group work. For example, if the project is
divided among the team members, you can create a table of the tasks of individual members,
share the Excel file on the web, and have each member update the status of her/his task in the
table, so that the overall status of the project can be seen at a glance. Like Word, even a
single spreadsheet can be a powerful tool to dramatically improve the efficiency of your group
work when combined with IT.
In relation to data analysis, the use of various Excel functions allows basic statistical analyses,
e.g. mean and variance, and more complex statistical analyses, e.g. test of a difference between
population means, nonparametric test, correlation, and multivariate analysis. There are many
textbooks on the use of Excel in statistical analysis and multivariate analysis. For examples, see
reference [1].
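The same basic statistics can also be computed in a few lines of code, which is convenient when the analysis must be repeated for many data files. The sketch below is an illustration in Python rather than Excel. It uses only the standard `statistics` module, and the Welch t statistic is one common choice for testing a difference between two means that is assumed here, not prescribed by this text. The function name and the data are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    # Made-up fluorescence readings from two conditions, for illustration only.
    control = [1.0, 2.0, 3.0, 4.0, 5.0]
    treated = [2.0, 3.0, 4.0, 5.0, 6.0]
    print("mean:", statistics.mean(control), statistics.mean(treated))
    print("sample variance:", statistics.variance(control), statistics.variance(treated))
    print("Welch t:", welch_t(control, treated))
```

Turning the t statistic into a p-value requires the t distribution, for which a spreadsheet function or a dedicated statistics package is the practical route.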
If you want to combine Excel functions to perform complex calculations or tasks efficiently
(i.e., run a series of tasks automatically), a “macro” (“Tool”-“Macros” in the menu) makes this
much easier. You can write a macro using the programming language VBA (Visual Basic for
Applications) for Excel. Interested readers are referred to reference [2] and other guides.
Last but not least, tables and graphs can be easily created by selecting “Insert”-“Table”, “Graph”
in the menu. Particularly, you can choose an appropriate graph from column, line, pie, bar,
area, scatter, and other charts (stock, contour, doughnut, bubble, and radar) depending on your
purpose, which is an advantage of using the graph drawing tool of Excel.
Graph Drawing
You should also learn the software Gnuplot, which allows for efficient graph drawing. This free,
open-source software program with advanced functions can be regarded as a programming
language for 2D/3D graph drawing. Excel is more convenient in that it allows you to make a
graph by just following a wizard, but Gnuplot provides more functions and performs better at
processing a large amount of data quickly into a graph.
Column: TeX
Undergraduate science majors often write their graduation theses in TeX. Scientific papers
in some fields are also often written in TeX (needless to say, many authors use Microsoft
Word). TeX is a free typesetting system in which the user prepares the text and images/tables
(i.e. the content), while the computer takes care of layout and other non-content elements of
the document based on a class file (or style file) and other settings. Writing in TeX is rather
like making a webpage in HTML, and thus more cumbersome than writing in Word.
Nevertheless, many people are still using TeX all over the world, possibly because it can make
equations particularly beautiful.
Presentation
As already discussed in section 10, a presentation at BIOMOD is different from a typical confer-
ence presentation, in that you can add drama and entertainment to it. But the most important
thing is to make the content of your project understood, and to explain it clearly.
Presentation structure (example):
Presentation Structure
In a presentation, simply showing the results of your project is not enough. You need to empha-
size the points of your story and pace the talk effectively. For that purpose, the structure of the
presentation should also take the psychology of the audience into account. If your experiment
has been successful, then show it at the climax. If your experiment has not been successful but
has prompted a unique principle, place more emphasis on its description. In addition, arrange
the results in the best order to make them understood.
In any kind of presentation, the goals and conclusions should agree with each other. If they
don’t, the audience may not believe your argument. In addition, you can also strongly impress
the audience by talking about the future progress that your results make possible.
Read the speech script aloud and time the presentation with a stopwatch. Adjust the number of slides and
the length of script for each part so that you can allocate enough time to the most important
points.
Synchronization between the speech script and the slides:
Terms used in the script have to completely match those in the slides. In addition, the flow of
the script for a slide must match the order of things displayed in that slide (from the top to the
bottom, from the left to the right).
Simple and clear slides:
Each slide should not contain too much information. Texts should be in a large font and brief. In
general, one slide should not contain more than 7 lines.
Use of consistent terms and icons:
Each thing should have one name, one icon, and one shape of the same coloring throughout
the slides. Since everyone in your group works in parallel, some of you might be using different
names for the same things. Make sure these are unified into the same terminology. This is
where communication among the team members matters.
Graph Requires Text Summary:
A look at each slide should tell the viewer approximately what it means. Make sure each graph
of your results has text that summarizes what your graph indicates.
Last Slide:
This slide will be on the screen throughout the discussion time, and thus should include more
than something like “Thank you for your attention”, such as a summary of the results or a
positive message.
Movie and Visual Effects:
Unfortunately, a movie embedded in a slide sometimes fails to play at the BIOMOD final. Instead,
you should rely on animation wherever you can. Do not abuse visual effects during slide change.
Save them for where they are really needed.
Appoint a Slide Flipper:
Let the speaker concentrate on reciting the speech script, by appointing someone else to switch
the slides. The slide switcher should rehearse again and again in advance to learn where to
change slides in the script.
which is composed of “pixels”, is called a raster image. The letter on the right, which is composed
of “coordinates and lines”, is called a vector image.
Even if an image is drawn based on draw commands, it is not practical for you to input the draw
commands like a program. What you need is a software program that allows drawing actions of
the user to be performed intuitively, and automatically converts them into draw commands.
In a software program for vector image drawing, “user-friendliness” is even more important than
in a raster image drawing program. The graphics software Illustrator (Adobe) is a representative
program of the former kind [1].
To master Illustrator, you should learn about the terms anchor point, segment, and path (see
the figure below for an example). (1) Selecting the Pen tool and (2) clicking on the start point
will create an anchor point there. Then (3) clicking on the end point will create another anchor
point there and a segment (a straight line in this case) will appear between these two points.
To continue the drawing, (4) clicking on another point will create the next anchor point, which
will add another segment. An uninterrupted line composed of anchor points and a segment(s)
is called a path. If Illustrator is new to you, you should start from the Pen tool and proceed to
adding a new anchor between segments, connecting anchors with a smooth curve, and then
drawing a more complex curve.
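The idea of anchor points, segments, and a path can be made concrete with a short script. The sketch below does not show how Illustrator works internally; it simply writes the same kind of draw commands in SVG, a widely used vector format, from a list of anchor points. The coordinates and function names are chosen only for illustration.

```python
def path_from_anchors(anchors):
    """Build an SVG path string: move to the first anchor, then draw
    one straight segment to each following anchor."""
    if len(anchors) < 2:
        raise ValueError("a path needs at least two anchor points")
    commands = [f"M {anchors[0][0]} {anchors[0][1]}"]
    for x, y in anchors[1:]:
        commands.append(f"L {x} {y}")
    return " ".join(commands)

def svg_document(anchors, width=200, height=200):
    """Wrap the path in a minimal SVG document."""
    d = path_from_anchors(anchors)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<path d="{d}" fill="none" stroke="black"/></svg>'
    )

# Three clicks with a pen tool: start point, end point, one more point.
anchors = [(20, 150), (100, 30), (180, 150)]
path_d = path_from_anchors(anchors)
svg_text = svg_document(anchors)
print(path_d)
```

Saving `svg_text` to a file with the extension .svg and opening it in a web browser should show two connected straight segments, which stay sharp at any zoom level because only coordinates and lines are stored.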
Apart from various vector images, photographs taken with a camera are a kind of raster image.
Therefore, you will also need a software program to add effects to a photograph or adjust its
contrast, brightness, and tone. Among such raster images, photographs and other complex
images are usually processed with a specific kind of software, which is called a photo editor.
Photoshop (Adobe) is a photo editor [3]. You can also scan a free-hand drawing and process it
with Photoshop to have a digital illustration as follows.
Alternatively, you can use GIMP, a freeware program with basic functions comparable to those
of Photoshop.
—Takashi Nakakuki (Kyushu Institute of Technology)
57. 3D Animation
Animation can be very effective in explaining molecular reactions and mechanisms. Creating an
animation, however, requires as many as thousands of images and thus enormous labor. You can
create one relatively easily with the help of three-dimensional computer graphics (3D CGI).
Modeling
First, this section describes the implementation of a moving model (modeling) for animation.
In 3D CGI, a human character or any other object is composed of vertices, edges, and faces,
which is called a polygon. There are three modeling methodologies: polygonal modeling, which
directly edits a polygon, curved surface modeling, which indirectly edits a polygon using Bezier
and NURBS curves, and sculpt modeling, which creates an object as if working with clay. Polygonal
modeling is the most basic, whereas curved surface modeling is suitable for symmetric struc-
tures such as cylinders and sculpt modeling is good for making complex structures with many
bumps. Increasing the number of polygons (high poly) means you can model a more complex
structure, but this also demands more from your PC.
Material
You can change the color, material, and reflectivity of your model by changing material settings.
Texture settings are used to add surface patterns, smaller features, and other details. A texture
that carries color and similar patterns is called a color texture, and a texture with the information
on features drawn on it is called a normal map texture. Although these textures can be created
using a graphics editor, they can be downloaded from the CG Textures site and other similar
websites, too. To use a texture, a process of projecting the texture onto your model (UV mapping)
is required.
UV mapping
Animation
In 3D CGI animation, a motion is made by registering a pose of a model for each time point and
interpolating the poses. An animation made by registering the poses, or “setting key frames”,
is called a key frame animation. You can use Graph Editor to perform the motion interpolation
in detail. To move a human, animal, or any other multi-jointed model, bones (a skeleton) are
used. Each bone is subjected to “skinning” to associate it with the polygon of the model, which
allows a move of the bones to change the morphology of the polygon. Bones (a skeleton)
have a hierarchical structure: a structure in which a child follows the motion of its parent is
called FK (forward kinematics), and a structure in which the motion of a child determines the
position of its parent is called IK (inverse kinematics). In the motion of bones of hands, legs, and
other object-contacting parts, IK is usually employed. Although FK provides a higher degree of
freedom of motion, the use of IK in walking, pushing an object, and similar motions can improve
your animation. Key frames are usually set by hand, but it is also possible to generate the
motion of a CG model from that of a real person or object by motion capture, or by baking
physics simulation results to key frames. Using these techniques, you can create an
animation that is too complex to input by hand.
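Key frame interpolation can be illustrated with a minimal sketch. The code below assumes simple linear interpolation of a single value (for example, the rotation angle of one bone); the key frame numbers and angles are made up, and actual 3D software typically offers smoother interpolation curves that you adjust in its graph editor.

```python
def interpolate(keyframes, frame):
    """Linearly interpolate a value at `frame` from (frame, value) key frames."""
    keyframes = sorted(keyframes)
    if frame <= keyframes[0][0]:
        return keyframes[0][1]          # before the first key frame
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]         # after the last key frame
    for (f0, v0), (f1, v1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)   # 0 at f0, 1 at f1
            return v0 + t * (v1 - v0)

# Hypothetical key frames: rotation angle (degrees) of one bone.
keys = [(0, 0.0), (10, 90.0), (30, 45.0)]
angles = [interpolate(keys, f) for f in range(0, 31, 5)]
print(angles)
```

Only three poses are registered here, yet a value is available for every frame in between, which is why key frame animation needs far fewer hand-made inputs than drawing every image.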
Rendering
In rendering, a light source(s) is set to illuminate a model and an image is created based on the
lighting and camera settings. The overall quality of your animation is greatly affected by the
settings of rendering. Lighting usually includes key light, fill light, and back light (“three-point
lighting”). In rendering, you can use ray tracing to add reflection and refraction by tracing the
path of a light to reach the camera, global illumination to compute lights reflected on an object
(indirect light), ambient occlusion to add a three-dimensional appearance by shading gaps and hollows in the
model, and other functions to obtain high-quality graphics comparable to live action films.
2D animation and similar graphics can also be rendered by emphasizing the edges of a model,
binarizing the shades, and other techniques (cel shading).
Closing Remarks
The above is just a taste of 3D CGI creation. You can also render a fluid, the destruction of an
object, and so on by employing physics simulation, Boolean operations, and other functions.
It is usually more practical to output the background and the model in your movie separately,
which reduces the rendering time needed for a retake. Output them as numbered images that
support transparency (e.g. in PNG format) and combine them later using a video editor. This
approach can also be used to combine the model with a real background.
Useful Sites for 3D CGI
- Metasequoia (modeling software): http://metaseq.net/jp/
- MAKEHUMAN (human character making software): http://www.makehuman.org/
- Sculptris (sculpt modeling software): http://oakcorp.net/zbrush/sculptris/index.php
- Maya (3D CGI software): http://www.autodesk.com/education/free-software/maya
- Blender (3D CGI software): http://blender.jp/
- BLEND SWAP (free models for Blender): http://www.blendswap.com/
- MOLECULAR FLIPBOOK (molecular animation software): http://www.coolhunting.com/tech/molecular-flipbook
- CG textures (texture download): http://www.cgtextures.com/
—Masaru Tsuzawa and Ibuki Kawamata (Tohoku University)
There are a variety of formats digitized image data can take, and popular ones have an extension
like .JPG, .PNG, .TIFF, or .BMP. In addition, each software program may have its own format.
Uncompressed TIFF format is the de facto standard for images for data processing.
Pre-processing is Critical
In a TV thriller or movie, you may have seen the blurred face of a suspect getting sharpened
through image processing. However, image processing does not always go like that. Of course,
you can use macros to automate data processing, but extracting data from an image with too
much noise is very difficult. There are a variety of noise reduction algorithms (for example,
a Gaussian filter, a median filter, or a two-dimensional Fourier transform, a window function,
and then an inverse Fourier transform [*]). It is the job of the experimenter who has actually
obtained an image from an experiment to process the image into an automatically analyzable
state for data extraction.
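To give a feeling for what a noise reduction filter does, here is a minimal sketch of a median filter in plain Python. It is an illustration only, not the implementation used by ImageJ or any other program; the image is a made-up 5 x 5 grid of gray values, and the border handling (ignoring pixels outside the image) is an assumption.

```python
import statistics

def median_filter(image, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighborhood.
    Positions outside the image are simply left out at the borders."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(statistics.median(window))
        result.append(row)
    return result

# Made-up grayscale image: uniform background with one noisy pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter(noisy)
print(cleaned[2][2])
```

A single outlier pixel is removed because it never becomes the median of its neighborhood. The same property also erases genuine small features, which is why the experimenter has to judge how much filtering an image can tolerate.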
ImageJ, in spite of its somewhat antique UI, is compatible with the latest OSs. Below is an
exemplary process of particle analysis using it.
Original image -> store in a TIFF format -> import into ImageJ: file information is provided at the
top left, such as 800 x 600 pixels; RGB; 1.8 MB.
Scaling: for example, draw a line over the 20 μm scale bar by hand, and select Analyze -> Set Scale
to display Distance in pixels: 45 -> then enter Known distance: 20 and Unit of Length: μm.
Binarization: Process -> Binary -> Make Binary. Then confirm that you see black particles on a
white background. Otherwise, invert it by Select All -> Edit -> Invert.
Particle analysis: Analyze -> Analyze Particles -> input counter conditions (see below for an
example) -> OK to start counting.
This example includes 27 particles, with the area of each particle displayed. The results can be
exported to Excel.
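The scaling and counting steps can be sketched in plain Python to show what happens behind the menus. This is a simplified stand-in, not ImageJ's own algorithm: the binary image is made up, particles are defined as 4-connected groups of pixels (an assumption), and only the scale of 45 pixels per 20 μm is taken from the example above.

```python
from collections import deque

# Scale from the text: a 20 um scale bar spans 45 pixels.
UM_PER_PIXEL = 20 / 45

def analyze_particles(binary, min_pixels=1):
    """Label 4-connected groups of 1-pixels and return their areas in um^2."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one particle.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = 0
            while queue:
                cy, cx = queue.popleft()
                pixels += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if pixels >= min_pixels:
                areas.append(pixels * UM_PER_PIXEL ** 2)
    return areas

# Made-up binary image: 1 = particle pixel, 0 = background.
image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
areas = analyze_particles(image)
print(len(areas), [round(a, 3) for a in areas])
```

The `min_pixels` argument plays the role of a size condition: raising it discards small specks that are more likely to be noise than particles.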
Instead of ImageJ, you can also use Adobe Photoshop and the freeware GIMP [4]. For
scientific purposes, ImageJ and its wide range of macros will suffice. See reference [5] for details
and sections 56, 57, and 59 for CG/image preparation.
Universal Rules
Quality requirements may differ depending on your purpose, but will usually include no debris
(only the original experimenter can decide if something in an image is truly debris), good
contrast, less noise, good size of the object in the image, high resolution, and good tone levels.
Always try to obtain a good image that facilitates image processing.
—Shinichiro M. Nomura (Tohoku University)
Overview
Once you have set the policy on how to make your video and have the scenario and cast set (see
section 9), you actually start making the video. In general, video editing proceeds by collecting
movie, audio, and other data, editing and adding effects to them, and then setting the timeline.
As the video is nearing completion, show it to other team members and get feedback to revise
it again and again. Usually, these different steps need different software programs.
Necessary data include movies, sounds, BGM, and subtitles. The movies are prepared by CG (see
section 57) or live action. The TITECH team 2011 won second place in the YouTube category with
a movie that combined PowerPoint animation and live action. The Tohoku University team 2012
won first place for the YouTube Award with a movie that combined live action and CG. You may
also add complex effects to your movie using the software program After Effects [1] or the like.
You can record sounds and BGM, or you can use found materials. Sound editors are not
discussed here. The Arizona State University team 2014 won second place for the YouTube Award
using melodies that they composed. In addition to these examples, you should refer to other
highly evaluated videos in previous BIOMOD competitions (see section 5).
Video Editing
When it comes time to do the video editing, you will further process the collected movie and
audio data, copy and paste them, and arrange them in place. There are various applications that
can assist in these processes, and you can choose the one you like. An exemplary commercial
application is Premiere Pro [2], while Windows Movie Maker [3] and AviUtl [4] are freeware. Their
execution screens are shown in FIG. 1.
In any of these software programs, timeline editing is a central part of the task. The timeline specifies
the timing of shots, sounds, effects, and subtitles. As an example, FIG. 2 shows a timeline of
about 10 seconds on AviUtl, made by the Tohoku University team in 2012.
The timeline proceeds to the right. Vertically, layers are stacked that specify materials used for
each time point. Blue indicates a picture, including subtitle. Red indicates a sound or BGM, and
green indicates an effect at a scene change.
As seen from the above, each scene of the video will be made by combining multiple types of
information, where your editing skills matter. Do it carefully, referring to the preview window.
AviUtl timeline
Scilab
Scilab can do numerical analysis, visualization, signal processing, and data analysis, and has
the following features:
Freeware
This is particularly important in BIOMOD. You can download it from the Scilab website [1] and
install it on your personal computer. It works with Linux, Mac, and Windows. There are a number
of Japanese textbooks on it.
Scilab is a programming language that allows you to carry out a series of tasks such as data
import, numerical analysis, and graph drawing in a seamless manner, has a grammar easier to
understand than that of C language, and does not require programmers to worry about data
type and memory allocation (students in my laboratory mastered its use for basic operations in
a few days).
The time course of the concentration of DNA can be expressed by the following differential
equations based on reaction kinetics (see section 39).
The figure below is the start-up window of Scilab. Type the following program line by line in the
left console. Here, the initial concentrations of DNA are X(0) = 10 nM, Y(0) = 20 nM, and
Z(0) = 0 nM. “//” in the program marks a comment. Scilab's grammar is not discussed here, but it
is easy to learn if you know the definition and use of vectors.
The axes in the graph are appropriately scaled, and the DNA concentration curves are differently
colored.
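For readers who want to try the same kind of calculation outside Scilab, the following is a minimal sketch in Python. The reaction scheme is an assumption made here for illustration only: a single irreversible hybridization X + Y -> Z, so that dX/dt = dY/dt = -kXY and dZ/dt = kXY. The rate constant, time step, and simulated time are likewise hypothetical; only the initial concentrations come from the text. A classical fourth-order Runge-Kutta step is used so that no external library is needed.

```python
def derivatives(state, k):
    """Assumed reaction X + Y -> Z with second-order rate constant k."""
    x, y, z = state
    rate = k * x * y
    return (-rate, -rate, rate)

def rk4_step(state, k, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = derivatives(state, k)
    k2 = derivatives(shifted(state, k1, dt / 2), k)
    k3 = derivatives(shifted(state, k2, dt / 2), k)
    k4 = derivatives(shifted(state, k3, dt), k)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

# Initial concentrations from the text (nM); k, dt, and the time span are assumed.
state = (10.0, 20.0, 0.0)
k = 1e-3        # 1/(nM*s), hypothetical value
dt = 1.0        # s
trajectory = [state]
for _ in range(1000):
    state = rk4_step(state, k, dt)
    trajectory.append(state)

x_end, y_end, z_end = state
print(f"X = {x_end:.4f} nM, Y = {y_end:.4f} nM, Z = {z_end:.4f} nM")
```

Under these assumptions X is the limiting strand, so Z approaches 10 nM while about 10 nM of Y remains. The list `trajectory` holds the values that would be plotted as the three concentration curves.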
Software-Assisted Design
Various systems have been developed based on the geometrical, thermodynamic, and reaction
kinetic properties of DNA molecules. As the scale of such systems becomes larger and larger,
the risk of errors that cannot be foreseen by an empirical design methodology grows.
To circumvent such problems, use software-assisted design. The following introduces a basic
strategy for applying different software programs to different purposes in the design of DNA
molecular systems, in terms of geometry, dynamics, thermodynamics, and reaction kinetics.
Geometric Properties
Geometric properties of DNA double helix (see section 17) have been used to assemble two/three-
dimensional structures. You must consider the phase, pitch, and interatomic distance of a helix
when designing DNA systems. Applicable software programs include Namot, which can model
all atoms in DNA, Nanoengineer-1, specialized for three-dimensional modeling of double helix
structures, and caDNAno to design DNA origami structures based on crossovers (FIG. 1). You
must choose the right one depending on the size and features of your system (see section 66).
Structure design software (from the left, Namot, Nanoengineer-1, and caDNAno).
Dynamical Properties
By predicting the behavior of a DNA structure in a solution, you can assess its structural stability
and motility. For an all-atom model, an existing molecular dynamics software program, such
as NAMD, is effective. A structure file generated by caDNAno can be used in stability analysis
by CanDo and coarse-grained molecular dynamics simulation by oxDNA (see section 65). In any
instance, the Brownian motion of a DNA structure in a solution can be predicted (FIG. 2).
Dynamics simulation (from the left, results of NAMD, CanDo, and oxDNA).
Thermodynamic Properties
In order to design a DNA molecular system, it is important to predict the hybridization between
sequences, i.e. the secondary structure of DNA, rather than its three-dimensional structure. It is
also possible to predict the thermodynamic free energy and Tm from the sequence information
(see section 41), which can help to determine whether expected or unexpected secondary structures
will form. Such a measure can also be applied to an inverse problem, that is, to design a
sequence that can form the desired secondary structure [1].
Suitable software programs are NUPACK and DINAMelt (see section 62). Both allow secondary
structure and free energy prediction. NUPACK is good at designing a simple sequence and
predicting the secondary structure of multiple DNA strands, while DINAMelt is good at predicting
Tm and heat capacity (FIG. 3).
When you want to design a set of DNA sequences for DNA computing or tile assembly, you should
use a software program specialized for that purpose, e.g. DNA Design, to exclude the possibility
of their hybridization in any undesired combination (see section 63).
NUPACK
NUPACK is a program developed by Dr. Robert Dirks (Caltech). For an input DNA (RNA) base
sequence, it outputs all possible secondary structure(s) in a solution, i.e. secondary structure(s)
that are stable in terms of free energy. Below is the list of the types of information submitted to
NUPACK and information returned to the user.
[Input information]
RNA or DNA?
Reaction temperature
Number of DNA strands
Maximum number of DNA strands to form a secondary structure
Base sequence and concentration of each DNA
Concentration of Na⁺ and Mg²⁺
[Output information]
Stable secondary structure(s) and its concentration in the solution
Melting profile
Probability of base pairing for each base
Advantageously, it can operate online, and has an intuitive interface and a demo program. You
should try it at least once (http://www.nupack.org/).
The figure below is a screen shot of the results of inputting an exemplary DNA base sequence,
“GGTATGCGATGAGCACACCCGCGGGTTTCGCGCGT”. According to the figure, DNA with this base
sequence is most stable in a solution when it forms a structure composed of hairpin loops and
internal loops, with a minimum free energy (see section 42) of -12.27 kcal/mol.
In addition, NUPACK can design a base sequence that can form a desired structure, which is also
available online (see section 64).
DINAMelt
DINAMelt, developed by Dr. Nicholas R. Markham (Rensselaer Polytechnic Institute), is a web-
based program with functions similar to those of NUPACK. NUPACK and DINAMelt use essentially
the same structure prediction algorithm, which is an extended version of the algorithm first de-
veloped by Dr. Zuker in 1981. Moreover, both programs use free energy parameters calculated by
Dr. SantaLucia, which are required for secondary structure prediction, so that their secondary
structure prediction results will be essentially the same.
However, only DINAMelt is capable of detailed thermodynamic analysis. In the figure below, two
different DNA strands are reacted, and the concentration change of reaction species is plotted
against temperature.
In addition to a stable secondary structure and the free energy for it, the enthalpy, entropy,
melting temperature, and heat capacity are also calculated.
In contrast, what NUPACK can do but DINAMelt cannot is to “predict the secondary structure
made of three or more DNA strands” and “design the base sequence of DNA”. Your purpose will
unequivocally determine which program to choose.
Column: BLAST
First of all, DNA is the blueprint of life. In order to study the genetic nature of DNA, we sometimes
want to know the similarity between two base sequences. BLAST (Basic Local Alignment
Search Tool) is a popular program in bioinformatics. BLAST can receive a DNA base sequence
and search for and return base sequences similar to it from huge databases.
For example, you can readily investigate whether humans have a gene similar to a given monkey gene,
at http://www.ncbi.nlm.nih.gov/BLAST/.
Typical gene databases are too huge to compare sequences one by one in a reasonable
amount of time. Therefore, BLAST initially checks similarity by an easy and quick method to
exclude unpromising sequences from candidates. Such improvements have achieved a great
reduction in calculation time. In software programs to predict the behavior and properties of
biomolecules, both prediction accuracy and prediction time are very important factors. BLAST,
NUPACK, and DINAMelt, introduced in this section, all meet both of the requirements, and thus
are actively used by many.
The time required for DNA secondary structure prediction is proportional to the third power of
the base sequence length. For example, a 10-fold increase in base sequence length will result
in a 1,000-fold increase in computing time. Keep this in mind when you want to use NUPACK
or DINAMelt.
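The cubic scaling can be seen in a small dynamic programming example. The sketch below is not the algorithm used by NUPACK or DINAMelt, which minimize free energy; it is a simplified Nussinov-style substitute that only maximizes the number of nested Watson-Crick pairs. It is shown because it has the same three nested loops over sequence positions, which is where the third power of the length comes from. The test sequence and the minimum loop size of 3 are assumptions for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Nussinov-style dynamic programming: maximum number of nested
    Watson-Crick pairs, with at least `min_loop` unpaired bases in a hairpin."""
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # O(N) subsequence lengths
        for i in range(n - span):                # O(N) start positions
            j = i + span
            score = best[i][j - 1]               # case: base j stays unpaired
            for k in range(i, j - min_loop):     # O(N) candidate partners of j
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0

# Made-up hairpin-forming sequence.
print(max_base_pairs("GGGAAATCCC"))
```

Doubling the sequence length roughly multiplies the work of the three loops by eight, so it pays to analyze only the strands and lengths you actually need.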
To avoid this problem, we can use known orthogonal sequences [1]. These include 37 different
DNA sequences, each 23 bases long, which have been reported not to interfere with each other
in an unexpected manner.
Sequence Design Software
However, the number and length of sequences and the kinds of bases will differ between designs, and each
design will need a sequence(s) that meets its conditions. Here, sequence design software is
effective. It has been developed by many groups, but there is no de facto standard software
program because different sequence design problems have different conditions in detail. Some
software programs are introduced below.
Design Function of NUPACK
The online secondary structure prediction program NUPACK [2,3] has a sequence design func-
tion. Access the NUPACK website and select the Design tab. The succeeding steps are described
below, using a demo.
First, set the parameters at the top. For example, Nucleic acid type: DNA, Temperature: 26°C,
and Number of designs: 2. Then write a desired secondary structure in the Target structure box
by the dot-bracket-plus notation. In this notation, as shown in FIG. 1, “.” represents an unpaired
base, corresponding left and right brackets represent a hydrogen-bonded base pair, and “+”
represents a DNA nick.
Dot-bracket-plus notation
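To check a target structure string before submitting it, the notation can be parsed with a few lines of Python. This is a sketch written for this explanation, not code from NUPACK: the function name, the zero-based numbering of bases, and the example structure are all assumptions.

```python
def parse_dot_bracket_plus(structure):
    """Return (strand_lengths, pairs) for a dot-bracket-plus string.
    Bases are numbered from 0 across all strands; '+' is not a base."""
    strand_lengths = [0]
    stack = []
    pairs = []
    index = 0
    for symbol in structure:
        if symbol == "+":
            strand_lengths.append(0)      # nick: start a new strand
            continue
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unmatched ')'")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
        strand_lengths[-1] += 1
        index += 1
    if stack:
        raise ValueError("unmatched '('")
    return strand_lengths, sorted(pairs)

# Hypothetical target: two strands forming a 4-bp duplex with 2-base overhangs.
lengths, pairs = parse_dot_bracket_plus("..((((+))))..")
print(lengths, pairs)
```

An unbalanced string raises an error, which catches a common typing mistake: a missing or extra bracket.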
Then push Design at the bottom right and wait while the sequence design is executed. As many
base sequence sets are output as the number of designs you set. FIG. 2 shows sequence design
results for the secondary structure of the stick figure. The higher a set is placed, the better
its sequence(s).
Push “To Analysis” on the right side of the designed sequences to subject the corresponding
sequence to secondary structure analysis. You can check if the expected secondary structure is
formed. If any unexpected structure is formed, you have to change the conditions and structure
and repeat the sequence design.
You can add a variety of constraints in the Target structure box (see NUPACK Help for details).
For example, you can give a predetermined sequence to a portion of the structure, or you can
specify a sequence to be excluded.
DNA Design for Tile Design
A DNA Design toolbox [4] is useful for designing DNA tiles. This tool is written in MATLAB. A
base sequence is written as a string of alphabetical letters, and a Watson-Crick pairing region is
represented by a string of numbers. Each base to be designed is represented by an alphabetical
letter other than ATGC, e.g. N. Although you need some knowledge of programming, you can
design any combination of sequences once you get familiar with it.
Software for Combinatorial Optimization
To perform design and avoid unexpected structures, you need a sequence design program that
takes sequence homology and the free energy of secondary structure (see section 41) into
account [5,6]. Under conditions with various trade-offs, it searches for a sequence of the best
combination.
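As a rough illustration of what such a program has to evaluate, the sketch below scores a set of candidate sequences by the longest contiguous stretch that could pair between any two of them. This is a crude stand-in for the homology and free energy calculations of the cited software, not a reimplementation of it; the candidate sequences and function names are made up.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(seq):
    """Sequence that pairs with `seq` in antiparallel orientation."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_complementary_stretch(a, b):
    """Length of the longest contiguous region of `a` that could form a
    Watson-Crick duplex with some region of `b` (longest common substring
    of `a` and the reverse complement of `b`)."""
    target = reverse_complement(b)
    best = 0
    previous = [0] * (len(target) + 1)
    for base_a in a:
        current = [0] * (len(target) + 1)
        for j, base_t in enumerate(target, start=1):
            if base_a == base_t:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best

def worst_crosstalk(sequences):
    """Largest complementary stretch over all distinct pairs of sequences."""
    return max(
        longest_complementary_stretch(a, b)
        for i, a in enumerate(sequences)
        for b in sequences[i + 1:]
    )

# Made-up candidate set; the third sequence is fully complementary to the first.
candidates = ["ATCGGATTCA", "GGCTTAACGT", "TGAATCCGAT"]
print(worst_crosstalk(candidates))
```

A search procedure would repeatedly change candidate sequences and keep the changes that lower such a score, while still respecting the pairings that the design requires.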
A few software programs for this are available online. One example is CircDesigNA [7], which
provides sequence design based on energy calculation. This software has the interface shown
at the top in FIG. 3, through which the user inputs the letter string that describes a desired
secondary structure.
Tohoku University has also developed the sequence design software Sequence Design based
on references [5,6]. This sequence design software for DNA computing was employed by the
Tohoku University team in 2014. It has an interface as shown at the bottom in FIG. 3, and
describes the required DNA with a string of alphabetical letters. Contact [8] to use it.
caDNAno
This freeware can design two-dimensional/three-dimensional structures on a GUI basis. The
latest is ver2.2.0 (as of March 2015).
Installation of caDNAno
Access the website to download the necessary files (http://cadnano.org/). For Mac, Maya 2012 has
to be installed in advance. For Windows, Maya 2012 and Python 2.7.2 have to be installed in advance.
These can be installed even on a laptop or tablet and work fine for design purposes. However,
you may sometimes need the large memory and graphics board of a desktop when you want to
display a three-dimensional structure in Maya.
The start-up window of caDNAno is composed of two areas. The left one displays the top view of
a DNA nanostructure, the right displays the DNA nanostructure projected to the plane and seen
from the side.
Select the “Honeycomb” or “Square” button at the top left. When a three-dimensional structure
is desired, “Honeycomb” is usually selected. In Honeycomb, each designed structure has a
base unit of 7 bases (a rotation of 240°) (Douglas et al. Nature 2009). In Square, each designed
structure has a base unit of 8 bases (0.75 turn = a rotation of 270°; 10.67 bp/turn) (Rothemund
Nature 2006). Note that actual B-form DNA has a pitch of 10.4 bp/turn, which differs from
the designed value in Square and results in a slightly twisted structure. To compensate, you
have to eliminate one base every 48 bases (Woo & Rothemund NatChem 2011).
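The numbers in this paragraph can be verified with a short calculation. The sketch below only restates the arithmetic of the text; the helper name `bp_per_turn` is chosen here for illustration.

```python
def bp_per_turn(bases, rotation_degrees):
    """Helical pitch implied when `bases` bases span `rotation_degrees`."""
    return bases / (rotation_degrees / 360.0)

honeycomb_pitch = bp_per_turn(7, 240)   # honeycomb lattice unit
square_pitch = bp_per_turn(8, 270)      # square lattice unit

# Skipping one base every 48 bases: 47 real bases now span the number of
# turns that 48 bases span on the ideal square lattice.
turns_in_48 = 48 / square_pitch
corrected_pitch = 47 / turns_in_48

print(f"honeycomb: {honeycomb_pitch:.2f} bp/turn")
print(f"square:    {square_pitch:.2f} bp/turn")
print(f"square with 1 skip per 48 bases: {corrected_pitch:.2f} bp/turn")
```

The honeycomb unit implies 10.5 bp/turn, while the square unit implies 10.67 bp/turn. Removing one base per 48 brings the square lattice down to about 10.44 bp/turn, close to the 10.4 bp/turn of B-form DNA quoted above.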
Numbered cylinders are created in the order you click. Neighboring cylinders do not need to
have serial numbers, but serial numbers will be easier for beginners to handle. The coordinates
of the mouse pointer (cylinder number and base number) are shown at the bottom left.
Each cylinder is made of two rows. Use the upper row to draw a line rightward, and the lower
row to draw a line leftward. The square in each arrow represents the 5’ end, and the arrowhead
represents the 3’ end. This line represents the scaffold. When you use “select” or the pen,
restrict what can be selected (for example, to only the scaffold and endpoints; red circle) to
make the operation easier.
After drawing lines for cylinders, connect the cylinders (lines). When you click approximately
where you want to introduce a connection, a number and square brackets will appear. The
number is the destination cylinder number. Click the square brackets to connect the lines and
form a crossover. Select the remainder and remove it with the delete key. If you want to extend
a scaffold, select it and drag and drop the square or arrowhead of the arrow. If you want to move
a crossover, first make “(X)overs” selectable under “Selectable” at the center of the upper
toolbar.
After connecting all the scaffolds by one stroke drawing (if you want to connect places other
than corner brackets, do it by force using the Pen tool), add staple strands. Push “AutoStaple” to
automatically arrange them, and then correct them by hand, one by one. This correction is for
their length and crossover positions, and is empirical for the most part. Each staple to be corrected
is represented by a bold line. Through cutting, length change, or rearrangement, adjust each
staple to a practical length.
Subsequently, assign specific sequences. When you push the “Seq” button at the bottom right,
scaffolds are suggested. Choose appropriate ones, and the sequences will be automatically
assigned. Unless you want something different, make sure that neither too many nor too few
bases are assigned to each scaffold in this step.
You can output the sequences in csv format by pushing the “Export” button at the top left. If
a sequence contains “?”, that staple is not paired with the scaffold at those positions. Unless
this is deliberate, you have to adjust the length of the staple and/or the scaffold, and then
repeat “Seq”.
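The check for “?” can be automated once the csv file has been exported. The following is a minimal sketch, not part of the software itself: it assumes that the exported file has a header row with a column named "Sequence" (adjust `seq_column` if your export differs), and the example rows are hypothetical data made up for illustration.

```python
import csv
import io

def find_unpaired_staples(csv_text, seq_column="Sequence"):
    """Return (row_number, sequence) for every staple whose sequence contains '?'.

    The column name "Sequence" is an assumption about the export layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row_number, row in enumerate(reader, start=1):
        sequence = (row.get(seq_column) or "").strip()
        if "?" in sequence:
            flagged.append((row_number, sequence))
    return flagged

# Hypothetical export contents, for illustration only.
example = """Start,End,Sequence,Length,Color
0[39],1[24],ACGTTGCAACGTTGCA,16,#cc0000
1[23],0[8],ACGT????TTGCAACG,16,#007200
"""

for row_number, sequence in find_unpaired_staples(example):
    print(f"staple {row_number}: {sequence.count('?')} unassigned base(s)")
```

For a real file, read its contents with `open(path, newline="").read()` and pass the text to the function. Any staple it reports needs the adjustment described above before you order sequences.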
You can also use “Paint” to display different staples in different colors, “Insert” to insert
one base, “Skip” to delete one base, and other buttons as appropriate.
You can use CanDo developed by an MIT group (http://cando-dna-origami.org/). Soon after
the upload of a sequence, it will return the calculation results. Because this program has
considerable constraints, you should not expect too much from it.
You can order custom sequences from any oligonucleotide synthesis service company. You will
receive them in about 4 business days, in separate tubes or in a multiwell plate format.
—Hisashi Tadakuma (Kyoto University)
Alternatively, it can output a video or pdb file (see section 66). The simulation process is roughly
as follows.
In principle, parameters do not need to be changed from their default values. Select Square
or Honeycomb for “Lattice type” as appropriate. If you want a video and/or pdb file, set the
corresponding options. Note that it may take from several hours to days to get the result for a
submitted file because requests are processed in the order they are received.
The simulation results are shown in FIG. 2. The output is the structure that is stable in
solution, with the color indicating the degree of thermal vibration. In the video, you can see
the structure undergoing thermal vibration. In the example structure, small differences in phase
between the double helices accumulate, resulting in a slightly twisted structure. Given this
shape, the structure is likely to be adsorbed with its back face up. When observed under AFM (see
section 18), about 60% of the structures actually had their back face up.
The first line converts a json file into an appropriate file. The second line is for relaxation, and
the third line executes the simulation. Temperature, time interval, and other conditions are
described by “input_relax” and “input”, for which sample files attached to the software can
be modified before use. The fourth line converts the output file into a different file format
to visualize it. Note that a standard personal computer will need several hours to simulate a
phenomenon that takes nanoseconds.
The visualized simulation results are shown in FIG. 3. The formation of a stable structure through
thermal vibration, with cracks and distortions in the structure, can be observed. By writing a
script to analyze it, the time course of the distance between any two points can be tracked.
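Such an analysis script can be short. The following is a minimal sketch under stated assumptions: the trajectory is taken to be a plain-text file in which each frame starts with a line beginning "t =", other header lines contain "=", and every remaining line begins with the x, y, z position of one nucleotide. This layout is an assumption, so check it against your own output file. Periodic boundaries are ignored, and the function names are hypothetical.

```python
import math

def parse_positions(trajectory_text):
    """Split a text trajectory into frames, each a list of (x, y, z) tuples.

    Assumed layout: a line starting with "t =" opens a new frame, other
    lines containing "=" are headers, and all remaining lines start with
    three position columns.
    """
    frames = []
    for line in trajectory_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("t ="):
            frames.append([])          # start of a new frame
        elif "=" in line:
            continue                   # other header line, skipped
        elif frames:
            x, y, z = (float(v) for v in line.split()[:3])
            frames[-1].append((x, y, z))
    return frames

def distance_time_course(frames, i, j):
    """Distance between points i and j (0-based) in every frame."""
    return [math.dist(frame[i], frame[j]) for frame in frames]

# Tiny made-up trajectory with two points and two frames.
example = """t = 0
b = 20 20 20
E = 0 0 0
0 0 0
3 4 0
t = 100
b = 20 20 20
E = 0 0 0
0 0 0
6 8 0
"""

frames = parse_positions(example)
print(distance_time_course(frames, 0, 1))  # [5.0, 10.0]
```

Plotting the returned list against the frame times gives the time course of the distance between the two chosen points.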
Uses of Simulation
Because simulation can provide a kind of information that you would never obtain through
designing DNA origami using a classical methodology, you should do this early in your project so
that you can create a more detailed design. In addition, the output of three-dimensional models
can be used to make a video or the like (see sections 57 and 59). A recent study reported an all-
atom molecular dynamics simulation [6], but it may not be usable in BIOMOD due to its high
computing cost.
—Ibuki Kawamata (Tohoku University)
Its executable file is not directly available, and thus the source code must be compiled in an
environment with the X Window System. An execution screen is shown on the left in FIG. 1. For a
simple single- or double-stranded DNA, you just have to select File -> Generate and input a
sequence to obtain its all-atom model. “Deoxy” in the “B” form should be selected.
You can store the obtained model in a format compatible with other applications, such as pdb,
by selecting File -> Write. The structure can be imported into the software programs VMD [3,4]
and Chimera [5,6] and displayed.
In Namot, you can use the “add unit” command to program the position of each nucleotide. For
details, see the online manual [1]. The right panel in FIG. 1 shows a structure generated from a
demo file included in Namot, visualized by VMD. You can create, for example, a superhelix formed
by a double helix, or a branched structure with double helices protruding upward, downward,
leftward, rightward, forward, and backward.
Recent studies have investigated all-atom models of DNA origami. As shown in FIG. 2, an all-atom
model can be created from a caDNAno file (provided by Tohoku University).
Molecular Dynamics
Starting from the all-atom model obtained, you can perform molecular dynamics simulations
based on classical physics. Popular simulation software programs are NAMD [7], Amber [8], and
CHARMM [9].
You can consult the NAMD tutorial for a molecular dynamics simulation of DNA, with a pdb file as
input. FIG. 3 shows the result of simulating the dissociation of a double helix at an elevated
temperature. Note that all-atom molecular dynamics is usually computationally expensive.
structure with T-motifs [11]. By selecting Build -> DNA -> Insert DNA, you can generate a three-
dimensional model of a double helix that has the same geometry as DNA. It allows you to draw
differently oriented helices, cut and connect terminals, and so on, in an intuitive manner. It is an
effective software program for designs in which only the phase and pitch of the helix matter.
larger system, newly introduced sequences might interfere with the original system, causing
unexpected reactions.
To avoid this, the behavior of the system should be predicted in advance by numerical
simulation based on the chemical reaction equations. In continuous-deterministic simulation [1,2],
you formulate ordinary differential equations from the reaction kinetics (see section 39) and
solve them with a numerical analysis library (see section 60). For a relatively small number of
molecules in a compartmentalized environment, such as liposomes (see section 89),
discrete-event/stochastic simulation can be applied instead.
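As an illustration of the continuous-deterministic approach, the following minimal sketch integrates the rate equations of a single bimolecular reaction A + B -> C, that is, d[A]/dt = d[B]/dt = -k[A][B] and d[C]/dt = k[A][B]. The reaction, the rate constant, and the concentrations are hypothetical values chosen for illustration. To keep the example self-contained, a hand-written fourth-order Runge-Kutta step stands in for a numerical analysis library.

```python
def derivatives(conc, k):
    """Right-hand side of the rate equations for A + B -> C."""
    a, b, c = conc
    rate = k * a * b
    return (-rate, -rate, rate)

def rk4_step(conc, k, dt):
    """Advance the concentrations by one classical Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))
    k1 = derivatives(conc, k)
    k2 = derivatives(shifted(conc, k1, dt / 2), k)
    k3 = derivatives(shifted(conc, k2, dt / 2), k)
    k4 = derivatives(shifted(conc, k3, dt), k)
    return tuple(
        x + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for x, s1, s2, s3, s4 in zip(conc, k1, k2, k3, k4)
    )

def simulate(initial, k, dt, steps):
    """Return the list of (A, B, C) concentrations at each time step."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], k, dt))
    return trajectory

# Hypothetical values: 10 nM of A and B, k = 1e-3 /nM/s, 1000 steps of 1 s.
trajectory = simulate((10.0, 10.0, 0.0), k=1e-3, dt=1.0, steps=1000)
a_final, b_final, c_final = trajectory[-1]
print(f"[A] = {a_final:.3f} nM, [C] = {c_final:.3f} nM after 1000 s")
```

For equal initial concentrations, this reaction has the closed-form solution [A](t) = [A]0 / (1 + k[A]0 t), which is a convenient check on the integrator before you move on to systems that have no analytical solution.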
As the number of molecular species involved increases, the differential equation will include
more and more variables, and thus will become difficult to manually formulate and analyze
repeatedly. It will also require a technique to determine the reaction rate constant and other
parameters.
In the case of DNA reactions, the elementary reactions are limited to a few types, e.g. strand
displacement reactions (see section 42) and enzymatic reactions (see section 45), so there are
design assistant software programs that can automate formulation and analysis. In the design and
simulation of a system such as an enzymatic reaction-based oscillator, the DACCAD tool is
effective [3] (see section 33). It features intuitive, graph-based operations.
DSD uses its own notation to describe DNA [6]. FIG. 1 summarizes an exercise found in the
manual. Each domain is indicated by a letter, optionally followed by a number, and its
complement is marked with “*”. Each short toehold sequence is marked with “^”. The symbol of each
rightward single strand is placed in “< >”, each leftward single strand in “{ }”, and each double
strand in “[ ]”. “:” denotes concatenation along a leftward single strand, while “::” denotes
concatenation along a rightward single strand.
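The bracket and marker conventions above can be practiced with a few string helpers. The following is only a toy sketch for getting used to the notation: the helper names are hypothetical, the domains "t^" and "x" are made up, and the code is not part of DSD and does not cover its full syntax.

```python
def complement(domain):
    """Toggle the trailing '*' that marks a complementary domain."""
    return domain[:-1] if domain.endswith("*") else domain + "*"

def rightward_strand(domains):
    """Write a rightward single strand in '< >'."""
    return "<" + " ".join(domains) + ">"

def leftward_strand(domains):
    """Write a leftward single strand in '{ }'."""
    return "{" + " ".join(domains) + "}"

def double_strand(domains):
    """Write a double strand in '[ ]'."""
    return "[" + " ".join(domains) + "]"

# A made-up strand with a toehold domain t^ followed by a domain x.
strand = ["t^", "x"]
print(rightward_strand(strand))                              # <t^ x>
print(leftward_strand([complement(d) for d in strand]))      # {t^* x*}
print(double_strand(["x"]))                                  # [x]
```

Writing out both strands of a duplex this way makes it easier to see which toeholds remain exposed before you type the species into the Code window.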
The initial concentration of each molecule is written as a program in the Code window (left),
which serves as the input to the software. You can also add reaction rate constants, the
simulation time, and the molecules to be displayed in the graph to the program. The software
includes various sample programs, and you can create a program for an amplifying gate (see
section 31) by selecting “Catalytic” from “Examples”.
Once the programming is done, compile it by clicking the “Compile” button at the top left. If
the program has not been written appropriately, this step will return an error (then you have
to correct it). After the compilation is successful, possible structures that can be reached from
the initial state are investigated exhaustively to formulate ordinary differential equations. Select
the Graph tab under Compilation over the right window to display a graph of reaction paths, as
shown in the upper panel of FIG. 3.
Then, click the “Simulate” button next to it to display the numerical simulation results. Unless
you change them, the parameters keep their default values and settings. You can choose between
“Stochastic” (stochastic simulation), which is the default, and “Deterministic” (deterministic
simulation). The result shown in the Plot1 tab under Simulation in the bottom right window
appears at the bottom of FIG. 3, where the horizontal axis represents time and the vertical axis
represents concentration. Various options are available; for example, you can change the
settings under Compilation and Options at the top right to perform a more in-depth simulation.
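To see what distinguishes a stochastic simulation from a deterministic one, it can help to write a small stochastic simulation by hand. The following is a minimal sketch of a Gillespie-style simulation of a single reaction A + B -> C on integer molecule counts. It is not the algorithm used inside the software; the reaction, the molecule counts, and the rate value are hypothetical.

```python
import random

def gillespie_bimolecular(a, b, c, rate, t_end, seed=0):
    """Stochastic simulation of A + B -> C on integer molecule counts.

    'rate' is a hypothetical stochastic rate constant (per molecule pair
    per unit time). Returns a list of (time, A, B, C) tuples.
    """
    rng = random.Random(seed)
    t = 0.0
    history = [(t, a, b, c)]
    while True:
        propensity = rate * a * b
        if propensity == 0:
            break                           # a reactant has run out
        t += rng.expovariate(propensity)    # waiting time to the next event
        if t > t_end:
            break
        a, b, c = a - 1, b - 1, c + 1       # one reaction event fires
        history.append((t, a, b, c))
    return history

history = gillespie_bimolecular(a=20, b=15, c=0, rate=0.01, t_end=1e9)
t_last, a_last, b_last, c_last = history[-1]
print(f"{len(history) - 1} events; final counts A={a_last}, B={b_last}, C={c_last}")
```

Each run with a different `seed` gives a different staircase-like trace, whereas a deterministic simulation of the same reaction gives one smooth curve. The difference matters most when the number of molecules is small.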
This software program can predict the time course of concentration, and is useful for system
design/analysis.