
Python Projects for Beginners: A Ten-Week Bootcamp Approach to Python Programming (PDF)

Immerse yourself in learning Python and introductory data analytics with this book’s project-based approach. Through the structure of a ten-week coding bootcamp course, you’ll learn key concepts and gain hands-on experience through weekly projects.

Each chapter in this book is presented as a full week of topics, with Monday through Thursday covering specific concepts, leading up to Friday, when you are challenged to create a project using the skills learned throughout the week. Topics include Python basics and essential intermediate concepts such as list comprehension, generators and iterators, understanding algorithmic complexity, and data analysis with pandas. From beginning to end, this book builds up your abilities through exercises and challenges, culminating in a solid understanding of Python.

Challenge yourself with the intensity of a coding bootcamp experience or learn at your own pace. With this hands-on learning approach, you will gain the skills you need to jumpstart a new career in programming or further your current one as a software developer.

What You Will Learn

Understand beginning and more advanced concepts of the Python language
Be introduced to data analysis using pandas, the Python Data Analysis library
Walk through the process of interviewing and answering technical questions
Create real-world applications with the Python language
Learn how to use Anaconda, Jupyter Notebooks, and the Python Shell

Who This Book Is For

Those trying to jumpstart a new career in programming, and those already in the software development industry who would like to learn Python programming.

Python Projects for Beginners A Ten-Week Bootcamp Approach to Python Programming — Connor P. Milliken


Python Projects for Beginners
Connor P. Milliken
Derry, NH, USA

ISBN-13 (pbk): 978-1-4842-5354-0
ISBN-13 (electronic): 978-1-4842-5355-7
https://doi.org/10.1007/978-1-4842-5355-7

Copyright © 2020 by Connor P. Milliken

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Nikhil Karkal
Development Editor: Rita Fernando
Coordinating Editor: Divya Modi

Cover designed by eStudioCalamar
Cover image designed by Pixabay

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail [email protected], or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

For information on translations, please e-mail [email protected], or visit http://www.apress.com/rights-permissions.

Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/978-1-4842-5354-0. For more detailed information, please visit http://www.apress.com/source-code.

Printed on acid-free paper

This book is dedicated to my girlfriend Jess. Ever since we first met, you changed my life forever. There’s so much that I wish to tell you each day, like how beautiful you are, how you inspire me, or how I would give anything just to be with you every second of the day. Your smile lights up my whole world and you make me so unbelievably happy. Anytime I have a bad day, I know you’ll always be there for me. I thought that I would only find you in my dreams, but here you are, standing in front of me, looking beautiful as ever. From the day I met you, I knew I wanted to give you everything. You’re smart, motivated, beautiful, and resemble all that is right with this world. If I only do one thing right in life, I’d like it to be you. I promise to always push you to be better, always support you in times of need, and always be there with a Werther's candy to help you study. Your dreams have become my dreams, and whatever you want in life, I want to be there to celebrate and help guide you. I will always love you, past forever, with all my heart and soul. So I have only one question left for you… [turn the page]

Will You Marry Me?

Table of Contents

About the Author
About the Technical Reviewer
Acknowledgments

Chapter 1: Getting Started
  Monday: Introduction
    What Is Python?
    Why Python?
    Why This Book?
    Who This Book Is For?
    What You’ll Learn
  Tuesday: Setting Up Anaconda and Python
    Cross-Platform Development
    Installing Anaconda and Python for Windows
    What Is Anaconda?
    What Is Jupyter Notebook?
  Wednesday: How to Use the Terminal
    Changing Directories
    Checking the Directory
    Making Directories
    Creating Files
    Checking a Version Number
    Clearing the Terminal Output
    Using the Python Shell
    Writing Your First Line of Python
    Exiting the Python Shell
  Thursday: Using Jupyter Notebook
    Opening Jupyter Notebook
    Creating a Python File
    Jupyter Notebook Cells
  Friday: Creating Your First Program
    Line Numbers Introduced
    Creating the Program
    Final Output
  Weekly Summary
  Weekly Challenges

Chapter 2: Python Basics
  Monday: Comments and Basic Data Types
    What Are Comments and Why Use Them?
    Writing Comments
    What Are Data Types?
    The Print Statement
    Integers
    Floats
    Booleans
    Strings
  Tuesday: Variables
    How They Work
    Handling Naming Errors
    Integer and Float Variables
    Boolean Variables
    String Variables
    Using Multiple Variables
    Using Operators on Numerical Variables
    Overwriting Previously Created Variables
    Whitespace
  Wednesday: Working with Strings
    String Concatenation
    Formatting Strings
    String Index
    String Slicing
  Thursday: String Manipulation
    .title()
    .replace()
    .find()
    .strip()
    .split()
  Friday: Creating a Receipt Printing Program
    Final Design
    Initial Process
    Defining Our Variables
    Creating the Top Border
    Displaying the Company Info
    Displaying the Product Info
    Displaying the Total
    Displaying the Ending Message
    Displaying the Bottom Border
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 3: User Input and Conditionals
  Monday: User Input and Type Converting
    Accepting User Input
    Storing User Input
    What Is Type Converting?
    Checking the Type
    Converting Data Types
    Converting User Input
    Handling Errors
    Code Blocks and Indentation
  Tuesday: If Statements
    How They Work
    Writing Your First If Statement
    Comparison Operators
    Checking User Input
    Logical Operators
    Membership Operators
  Wednesday: Elif Statements
    How They Work
    Writing Your First Elif Statement
    Checking Multiple Elif Conditions
    Conditionals Within Conditionals
    If Statements vs. Elif Statements
  Thursday: Else Statements
    How They Work
    Writing Your First Else Statement
    Complete Conditional Statement
  Friday: Creating a Calculator
    Final Design
    Step #1: Ask User for Calculation to Be Performed
    Step #2: Ask for Numbers, Alert Order Matters
    Step #3: Set Up Try/Except for Mathematical Operation
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 4: Lists and Loops
  Monday: Lists
    What Are Lists?
    Declaring a List of Numbers
    Accessing Elements Within a List
    Declaring a List of Mixed Data Types
    Lists Within Lists
    Accessing Lists Within Lists
    Changing Values in a List
    Variable Storage
    Copying a List
  Tuesday: For Loops
    How Loops Work
    Writing a For Loop
    Range()
    Looping by Element
    Continue Statement
    Break Statement
    Pass Statement
  Wednesday: While Loops
    Writing a While Loop
    While vs. For
    Infinite Loops
    Nested Loops
  Thursday: Working with Lists
    Checking Length
    Slicing Lists
    Adding Items
    Removing Items
    Working with Numerical List Data
    Sorting a List
    Conditionals and Lists
    Loops and Lists
  Friday: Creating Hangman
    Final Design
    Previous Line Symbols Introduced
    Adding Imports
    Declaring Game Variables
    Generating the Hidden Word
    Creating the Game Loop
    Outputting Game Information
    Checking a Guess
    Clearing Output
    Creating the Losing Condition
    Handling Correct Guesses
    Creating a Winning Condition
    Outputting Guessed Letters
    Adding Guessed Letters
    Handling Previous Guesses
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 5: Functions
  Monday: Creating and Calling Functions
    What Are Functions?
    Function Syntax
    Writing Your First Function
    Function Stages
    UDF vs. Built-in
    Performing a Calculation
  Tuesday: Parameters
    What Are Parameters?
    Passing a Single Parameter
    Multiple Parameters
    Passing a List
    Default Parameters
    Making Parameters Optional
    Named Parameter Assignment
    *args
    **kwargs
  Wednesday: Return Statement
    How It Works
    Using Return
    Ternary Operator
  Thursday: Scope
    Types of Scope
    Global Scope Access
    Handling Function Scope
    In-Place Algorithms
  Friday: Creating a Shopping Cart
    Final Design
    Initial Setup
    Adding Items
    Removing Items
    Showing the Cart
    Clearing the Cart
    Creating the Main Loop
    Handling User Input
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 6: Data Collections and Files
  Monday: Dictionaries
    What Are Dictionaries?
    Declaring a Dictionary
    Accessing Dictionary Information
    Using the Get Method
    Dictionaries with Lists
    Lists with Dictionaries
    Dictionaries with Dictionaries
  Tuesday: Working with Dictionaries
    Adding New Information
    Changing Information
    Deleting Information
    Looping a Dictionary
  Wednesday: Tuples, Sets, Frozensets
    What Are Tuples?
    Declaring a Tuple
    What Are Sets?
    Declaring a Set
    What Are Frozensets?
    Declaring a Frozenset
    Data Collection Differences
  Thursday: Reading and Writing Files
    Working with Text Files
    Writing to CSV Files
    Reading from CSV Files
    File Modes in Python
  Friday: Creating a User Database with CSV Files
    Final Design
    Setting Up Necessary Imports
    Handling User Registration
    Handling User Login
    Creating the Main Loop
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 7: Object-Oriented Programming
  Monday: Creating and Instantiating a Class
    What Is an Object?
    OOP Stages
    Creating a Class
    Creating an Instance
    Creating Multiple Instances
  Tuesday: Attributes
    Declaring and Accessing Attributes
    Changing an Instance Attributes
    Using the __init__() Method
    The “self” Keyword
    Instantiating Multiple Objects with __init__()
    Global Attributes vs. Instance Attributes
  Wednesday: Methods
    Defining and Calling a Method
    Accessing Class Attributes in Methods
    Method Scope
    Passing Arguments into Methods
    Using Setters and Getters
    Incrementing Attributes with Methods
    Methods Calling Methods
    Magic Methods
  Thursday: Inheritance
    What Is Inheritance?
    Inheriting a Class
    Using the super() Method
    Method Overriding
    Inheriting Multiple Classes
  Friday: Creating Blackjack
    Final Design
    Setting Up Imports
    Creating the Game Class
    Generating the Deck
    Pulling a Card from the Deck
    Creating a Player Class
    Adding Cards to the Player’s Hand
    Showing a Player’s Hand
    Calculating the Hand Total
    Handling the Player’s Turn
    Handling the Dealer’s Turn
    Calculating a Winner
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 8: Advanced Topics I: Efficiency
  Monday: List Comprehension
    List Comprehension Syntax
    Generating a List of Numbers
    If Statements
    If-Else Statements
    List Comprehension with Variables
    Dictionary Comprehension
  Tuesday: Lambda Functions
    Lambda Function Syntax
    Using a Lambda
    Passing Multiple Arguments
    Saving Lambda Functions
    Conditional Statements
    Returning a Lambda
  Wednesday: Map, Filter, and Reduce
    Map Without Lambdas
    Map with Lambdas
    Filter Without Lambdas
    Filter with Lambdas
    The Problem with Reduce
    Using Reduce
  Thursday: Recursive Functions and Memoization
    Understanding Recursive Functions
    Writing a Factorial Function
    The Fibonacci Sequence
    Understanding Memoization
    Using Memoization
    Using @lru_cache
  Friday: Writing a Binary Search
    Final Design
    Program Setup
    Step 1: Sort the List
    Step 2: Find the Middle Index
    Step 3: Check the Value at the Middle Index
    Step 4: Check if Value Is Greater
    Step 5: Check if Value Is Less
    Step 6: Set Up a Loop to Repeat Steps
    Step 7: Return False Otherwise
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 9: Advanced Topics II: Complexity
  Monday: Generators and Iterators
    Iterators vs. Iterables
    Creating a Basic Iterator
    Creating Our Own Iterator
    What Are Generators?
    Creating a Range Generator
  Tuesday: Decorators
    What Are Decorators?
    Higher-Order Functions
    Creating and Applying a Decorator
    Decorators with Parameters
    Functions with Decorators and Parameters
    Restricting Function Access
  Wednesday: Modules
    Importing a Module
    Importing Only Variables and Functions
    Using an Alias
    Creating Our Own Module
    Using Our Module in Jupyter Notebook
  Thursday: Understanding Algorithmic Complexity
    What Is Big O Notation?
    Hash Tables
    Dictionaries vs. Lists
    Battle of the Algorithms
  Friday: Interview Prep
    Developer Interview Process
    What to Do Before the Interview
    General Questions
    Whiteboarding and Technical Questions
    End of Interview Questions
    What to Do After the Interview
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Chapter 10: Introduction to Data Analysis
  Monday: Virtual Environments and Requests Module
    What Are Virtual Environments?
    What Is Pip?
    Creating a Virtual Environment
    Activating the Virtual Environment
    Installing Packages
    APIs and the Requests Module
    Using the Requests Module
  Tuesday: Pandas
    What Is Pandas?
    Key Terms
    Installing Pandas
    Importing Pandas
    Creating a DataFrame
    Accessing Data
    Built-in Methods
    Filtration
    Column Transformations
    Aggregations
    Pandas Joins
    Dataset Pipeline
  Wednesday: Data Visualization
    Types of Charts
    Installing Matplotlib
    Importing Matplotlib
    Line Plot
    Bar Plot
    Box Plot
    Scatter Plot
    Histogram
    Saving the Chart
    Flattening Multidimensional Data
  Thursday: Web Scraping
    Installing Beautiful Soup
    Importing Beautiful Soup
    Requesting Page Content
    Parsing the Response with Beautiful Soup
    Scraping Data
    DOM Traversal
  Friday: Web Site Analysis
    Final Design
    Importing Libraries
    Creating the Main Loop
    Scraping the Web Site
    Scrape All Text
    Filtering Elements
    Filtering Waste
    Count Word Frequency
    Sort Dictionary by Word Frequency
    Displaying the Top Word
    Graphing the Results
    Final Output
  Weekly Summary
  Challenge Question Solution
  Weekly Challenges

Afterword: Post-Course: What to Do Now?
  Back-End Development with Python
  Full-Stack Development with Python
  Data Analysis with Python
  Data Science with Python
  Resources
  Final Message

Index

About the Author

Connor P. Milliken

Focused on helping others achieve their goals through education and technology, Connor P. Milliken brings a wealth of programming and business experience to his classes. He graduated with a computer science degree from Daniel Webster College and is pursuing a master’s in computer science with a focus in interactive intelligence from Georgia Tech. Before becoming an instructor at Coding Temple, he designed simulators in the video game industry for several years. During that time, he took on a wide range of roles, from business to programming, releasing a total of 11 different titles on PC and co-creating an award-winning football card game called “Masters of the Gridiron.”

Connor has experience in more than seven different languages and three frameworks. He focuses primarily on web development and data analytics using Python. When this book was written, he taught for a coding bootcamp in Boston, MA, where students can learn Python, web development, and data analytics over a 10-week full-time course. He is now a software engineer at HubSpot, Inc. in Cambridge, MA.

GitHub: Connor-SM

About the Technical Reviewer

Bharath Thiruveedula currently works for a major telco service provider. He is a core reviewer and key contributor to various OpenStack/ONAP projects. Bharath is passionate about open source technologies and is an evangelist who is focused on making his mark in the Cloud/Container domains. He has been working on distributed systems and machine learning for a significant amount of time.

Acknowledgments

I would like to thank the following people for their generosity and help:

Jessica Boucher, who has been my rock this whole time. Your love and support have continued to help me in all my endeavors. I’m truly blessed to have you in my life.

My family, who have supported and believed in me all my life. Without your guidance, none of this would be possible. To have parents and siblings like you all is nothing short of a miracle and I wouldn’t have it any other way.

Clay and Dee Dreslough, who gave me an opportunity and mentored me. This book would not be possible without your guidance over the years. It was at Sports Mogul that I realized my passion for computer programming, thanks to you both.

Derek Hawkins, who mentored and taught me a lot about teaching, programming, Python, and Ping Pong.

Kirsten Arnold, who created all the art within this book. The work you were able to create from my poor drawing skills was exactly what I had imagined.

Ripal Patel, who helped with the interview portion of Week 9. Your expertise in the hiring and interview process has been wonderful for not only me but the students.

My friends, who over the years have been there for me through it all. Whether it was watching my dog, going on adventures, or just hanging out… thank you. I will always make the drive for you all.

My coaches, who taught me about perseverance, hard work, commitment, and teamwork. Whether it was 6 AM practices or triple sessions in the middle of summer, you’ve played a big part in my life and for that I’m grateful.

The Coding Temple team, who gave me the opportunity and entrusted me to educate those wanting to pursue a career in tech.

The Apress team, who have helped me throughout this entire process with writing, formatting, reviewing, and more.

My students, who helped to show me why teaching is so rewarding.

CHAPTER 1

Getting Started

Hello there. Welcome to your first step toward becoming a Python developer. Exciting, isn’t it? Whether you’re just beginning to learn how to program, or have experience in other languages, the lessons taught in this book will help to accelerate your goals. As a Python instructor, I can guarantee you that it’s not about where you start, it’s about how hard you’re willing to work.

At the time of writing this book, my day job is as a coding bootcamp instructor, where I teach students how to go from zero programming experience to professional developers in just ten weeks. This book was designed with the intent to bring a bootcamp-based approach to text. It aims to help you learn subjects that are valuable to becoming a professional developer with Python.

Each subsequent chapter will have an overview and a brief description of what we’ll cover that week. This week we’ll be covering all the necessary basics to get us jump started. Following the age-old saying, “You must learn to walk before you can run,” we must understand what our tools are and how to use them before we can begin coding.

Overview

• Understanding why and how this book works
• Installing Python and Anaconda
• Understanding how to use these new tools
• Understanding how to use the terminal
• Writing your first Python program

Without further ado, let’s get started, shall we?


Monday: Introduction

Almost every programmer remembers that “Aha!” moment, when everything clicked for them. For me, that was when I picked up Python. After years of computer science education, one of the best methods I found to learn was by building applications and applying the knowledge I learned. That’s why this book will have you coding along rather than reading about the theory behind programming. Python makes it simple to pick up concepts that are otherwise difficult in other languages. This makes it a great language for breaking into the development industry.

You may have already noticed that the structure of this book is different from most. Instead of chapters, we have each topic separated by weeks or days. Notice the current header for the section. This is part of the bootcamp-based approach, so that you may set goals for each day. There are two ways to follow along with this book:

1. Over the course of ten weeks
2. Over the course of ten days

If you’d like to follow the 10-week approach, then think of each chapter as a weekly goal. All chapters are broken up further into daily segments, Monday to Friday. The first four days, Monday through Thursday, will introduce new concepts to understand. Friday, better known as Project Day, is where we will create a program together based on the lessons learned throughout the week. The goal is that you set aside 30–60 minutes each day to complete each daily task.

If you’re eager enough to try the bootcamp style, where you learn all the material in ten days, then think of each chapter as a single day. Be aware that in order to complete this book in ten days, you will need to dedicate around 8 hours per day, which is a typical day for coding bootcamp students. In bootcamps (like the one I taught), we go over several concepts daily, and each subsequent day we reiterate the topics learned from previous lessons. This helps to accelerate the process of learning each concept.

What Is Python?

Python is an interpreted, high-level, general-purpose programming language. To understand what each of these descriptions means, let’s make a few comparisons:

• Low Level vs. High Level: Refers to whether we program using instructions and data objects at the level of the machine or whether we program using more abstract operations that have been provided by the language designer. Low-level languages (like C and C++) require you to allocate and manage memory, whereas Python manages memory for us.

• General Purpose vs. Targeted: Refers to whether the operations of the programming language are widely applicable or are fine-tuned to a domain. For example, SQL is a targeted language that is designed to facilitate extracting information from relational databases, but you wouldn’t want to use it to build an operating system.

• Interpreted vs. Compiled: Refers to whether the sequence of instructions written by the programmer, called “source code,” is executed directly (by an interpreter) or whether it is first converted (by a compiler) into a sequence of machine-level primitive operations. Most applications designed with Python are run through the interpreter, so errors are found at runtime.
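To make the point about runtime errors concrete, here is a minimal sketch. The function name `describe` and its argument are made up for illustration. The mistake inside the function (joining a string and a number) is not reported when the program starts; it only surfaces when that line actually runs.

```python
def describe(age):
    # Bug on purpose: joining a string and an integer is not allowed.
    # Python does not report this until the line is actually executed.
    return "Age: " + age


print("Program started")  # this line runs without any problem

try:
    print(describe(30))  # the TypeError is raised here, at runtime
except TypeError as error:
    print("Caught at runtime:", error)
```

Passing a string instead, as in `describe("30")`, avoids the error, which shows that the problem depends on the values the program sees while it is running.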

Python also emphasizes code readability and uses whitespace to separate snippets of code. We’ll learn more about how whitespace in Python works as we get into our lessons, but for now just know that Python is a great first language to break into the computer science industry.
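As a small preview of what “using whitespace” looks like, consider the sketch below. The variable names and the threshold value are hypothetical. The indented lines are the ones that belong to each branch; Python uses the indentation itself, rather than braces, to group them.

```python
temperature = 30  # hypothetical value, chosen only for this example

if temperature > 25:
    # This indented line belongs to the "if" block.
    message = "It is warm"
else:
    # This indented line belongs to the "else" block.
    message = "It is cool"

# Back at the left margin: this line runs no matter which branch was taken.
print(message)
```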

Why Python?

I could go on about why Python is so amazing, but a simple Google search would do that for me. Python is one of the easier languages to learn. Notice I said “easier” and not “easy”… that’s because programming is still difficult, but Python reads closer to the English language than most other languages. Another benefit of learning Python is that the concepts you learn from this book are still applicable to other languages. Python is also one of the most sought-after skills in the technology industry today, used by companies such as Google, Facebook, and IBM. It’s been used to build applications like Instagram, Pinterest, Dropbox, and many more.

It’s also one of the fastest growing languages in 2019, climbing to the top 3 languages to learn for the future.¹ How well does it pay though? According to Indeed.com, the average salary in 2018 was around $117,000 USD.² That’s a lot of Monopoly money.

One of the biggest reasons for learning Python, though, is how widely the language itself is used. It’s used in several different areas: front-end development, back-end development, full-stack, testing, data analytics, data science, web design, etc., which makes it a useful language.

¹ www.tiobe.com/tiobe-index/
² www.indeed.com/salaries/Python-Developer-Salaries

Why This Book?

Let’s start with the main reason for wanting to read this book. The material taught throughout this book has a proven track record. I’ve personally used this exact organizational approach to help get my students well-paying positions across a variety of industries. The structure of this curriculum has been repeatedly improved over the years to stick with current industry trends.

Another strength of this book vs. its competitors is how the concepts are taught. I won’t bore you with details; instead we’ll build small- and large-scale applications together throughout the course of this book. The best way to learn is often by doing. Especially when it comes to programming, one of the lessons I often tell students is to just try writing the code, and if it breaks, fix it. You won’t be able to learn if you don’t try to break things.

Lastly, this book will not only teach you how to program but how to think like a programmer. At the beginning of each week, I’ll challenge you, and by the end of the lesson, you’ll be able to understand the approach you need to take. You can always tell the difference between those who are only able to program and those who are proven developers.

Who This Book Is For?

It’s always good to understand what you’re getting into before you start reading a book. Before you commit to one, you must first decide whether the book itself is designed for you. If you can answer yes to any of the following questions, then this book is for you:

• Do you have experience in other programming languages but want to pick up a high-level language?
• Have you never programmed before but are eager to learn?
• Did you take computer science courses previously, but they just didn’t help you learn how to create applications?
• Do you want to make a career change?
• Have you tried to learn languages previously but couldn’t because of the difficulty of the language?
• Have you programmed in Python before but want to improve your abilities and learn new tools?

This book is designed for a wide array of readers, no matter your background. The real question for you is, “How hard are you willing to work?” The concepts taught in this book can benefit anyone willing to learn. Even if you’ve programmed in Python before, this book can still help you become a stronger developer.

What You’ll Learn

This book was created to be used for bootcamp classes designed to teach Python. You can expect to cover the information that would be required of you on the job as a Python developer. These concepts will give you the ability to go forward with your education in programming. At the end of each chapter, we’ll use the concepts covered to create a variety of real-world applications. After all, we’re not just focused on Python here; we’re trying to build you up to become a better developer.

Tomorrow, we’ll find out how to install the necessary software that this book uses. If you already have Anaconda and Python on your machine, you can skip to Wednesday’s lesson


Tuesday: Setting Up Anaconda and Python

Today, we’re going to get our software set up. Throughout this book we’ll be using a software platform called Anaconda, an integrated development environment (IDE) called Jupyter Notebook, and the language of Python itself. This book will strictly cover Python 3; however, at times you may see me mention subtle differences between versions 2 and 3. Let’s go ahead and download and install these first, then I’ll get into what each of them is.

Cross-Platform Development

Python runs on all major operating systems, making it a cross-platform language. This means that you can write code on one operating system and work with someone who uses a completely different machine than you. If both machines have Python installed, they should both be able to run the program.
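To make the cross-platform idea concrete, here is a minimal sketch that asks Python which operating system it is running on. The function name describe_environment is just a hypothetical name chosen for this illustration; the same file should run unchanged on Windows, Linux, or OS X, and only the reported name would differ. Don't worry about understanding every line yet.

```python
import platform

def describe_environment():
    """Return a short sentence naming the operating system running this code."""
    # platform.system() reports names such as "Windows", "Linux", or "Darwin" (OS X)
    os_name = platform.system() or "an unknown operating system"
    return "This program is running on " + os_name

summary = describe_environment()
print(summary)
```

Nothing in the code had to change for a particular machine, which is the point of a cross-platform language.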

Installing Anaconda and Python for Windows

Most OS X and Linux operating systems already come with Python installed; however, you still need to download Anaconda. For Windows users, Python usually isn’t included, but it gets installed with Anaconda. Use the following steps to install Anaconda properly.

1. Open your browser and go to www.anaconda.com/distribution/.

2. Click the download button in the header (see Figure 1-1).

Figure 1-1. Anaconda Download Page

3. Once you are on the next page, make sure you select the proper operating system in the header at the top. Click that button (see Figure 1-2).

Figure 1-2. Selecting an operating system

4. Next, click the download button for the Python 3.7 (or greater) section (see Figure 1-3).

Figure 1-3. Downloading Python 3.x version

5. This step is strictly for Windows users. Once the installer fully downloads, go ahead and run it. Use all defaults except for one option. When you get to the page in Figure 1-4, make sure you click the “add to path” option. This will let us access Anaconda through our terminal.

Figure 1-4. Add to Path

6. For all other options (besides step 5 for Windows users), use the default settings. Then go ahead and click the “Install” button and let Anaconda finish installing.

What Is Anaconda?

Anaconda is a Python and R distribution. It aims to provide everything you need for Python “out of the box.” Its primary use is for data analytics and data science; however, it’s a superb tool for learning as well. Upon downloading, it includes

• The core Python language and libraries

• Jupyter Notebook

• Anaconda’s own package manager

These are just a few features out of the many that Anaconda comes with; however, these are the ones we’ll be using throughout the book. The first feature in this list is the Python language and the included packages that Python has access to. Libraries are code written in advance by other developers that you can use for your own benefit. The second feature is discussed in the next section. Lastly, Anaconda has a way of managing environments for us. This is a complex topic that we’ll get into in later weeks.

What Is Jupyter Notebook?

Jupyter Notebook is an open-source integrated development environment (IDE) that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. For us, it’s essentially our notebook, where we will code along together. If you’re not familiar with IDEs, they are simply tools for developers to code in. Think of them as a canvas for artists. It also allows you to write snippets of code without needing to know a lot about Python. We’ll get more into Jupyter Notebook in Thursday’s lesson.

In today’s lesson, we installed Anaconda, Python, and Jupyter Notebook. Tomorrow, we’ll learn why and how to use the terminal


Wednesday: How to Use the Terminal

Depending on your operating system, you’re going to be using the Command Prompt (Windows) or the Terminal (Linux and OS X). From this point forward, I’m going to refer to it as the “terminal,” so just keep that in mind if you’re on Windows. The terminal is a tool that lets users issue commands to the computer through basic text. For most of this book, we will use the terminal to either test our Python code or run Jupyter Notebook. Today we’ll be learning basic commands and how to use the Python shell. To get started, let’s open the terminal. Because each operating system looks different, terminal commands in this book are marked with a “$”. Any text you see after that symbol is what you need to type into the terminal yourself.

Changing Directories

While inside the terminal, you’ll often want to move around from folder to folder. This gives you the power to navigate around your computer. It’s important to understand how to do this, as it’s always going to be what we do to start up Jupyter Notebook. In order to change directories, you need to type in “cd” followed by the name of the folder you wish to go to.

$ cd desktop

If you need to go backward, out of a folder, then you’ll want to use two dots (“..”).

$ cd ..

Often, throughout this book, you’ll need to traverse through several directories to get into a project folder. When you use the “cd” command, you can go as far forward or backward as you like; you just need to specify the correct path to the folder you wish to go to. Take the following command, for instance.

$ cd desktop/../desktop

We’re going into the desktop directory, but then going back out, only to go back into it. There’s nothing wrong with this; it’s just an example showing that the computer will follow the path that you specify. Normally we would just cd into the desktop and be done.
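If you’d like to see how a path with two dots gets followed, here is a small Python sketch. It is only an illustration, not how the terminal itself is implemented: the helper name resolve is hypothetical, and posixpath.normpath simply rewrites the text of a path (using forward slashes) without touching any real folders.

```python
import posixpath

def resolve(path):
    """Collapse the '..' steps in a path to show where it ends up."""
    # normpath works purely on the text of the path; it never visits the folders
    return posixpath.normpath(path)

print(resolve("desktop/../desktop"))           # desktop
print(resolve("desktop/python_bootcamp/.."))   # desktop
```

Both paths end up in the same place, which is why going in, out, and back in again is harmless.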


Checking the Directory

To check the directory that you’re currently in, just look to the left of where you type commands. For Windows users, the directory you’re currently in is the last folder in the path shown in the prompt, as follows.

C:\Users\name\desktop>

The last folder name is “desktop,” which means that I’m currently in the directory for my desktop. If I were to create any files or folders, they would be created directly there. On Linux, the current directory is the name just to the left of the “$”.

[email protected]:~/Desktop$

For OS X users, it’ll be to the left of your username (who you’re logged in as).

User-Macbook-Pro:Desktop Name$

Making Directories

Though it’s certainly okay to go into your file explorer, right-click, and select “create new folder,” it’s good to know how to create a new folder through the terminal session itself. Make sure that you’re in the “desktop” directory that we used “cd” to enter previously. Then write the following line.

$ mkdir python_bootcamp

This will create a new folder called “python_bootcamp” on your desktop. We’ll be using this folder from here on out to store our lessons so that we stay organized.

Creating Files

Again, it’s easier to create files by going into your file explorer. However, sometimes we need to create files in the terminal, depending on the file type. Before we create a new file, however, let’s “cd” into the “python_bootcamp” folder that we created.

$ cd python_bootcamp

Now, for Windows users, we’ll need to type the following.

$ echo. >example.txt

Or if you’re on Linux/OS X.

$ touch example.txt

You should now be able to see the example.txt file in your file explorer.
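For the curious, the same folder and file can also be created from Python itself. The following is a rough sketch rather than a required step: it works inside a temporary scratch directory (an assumption made here so that nothing on your real desktop is changed), and it uses the standard pathlib module.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so your real desktop is left alone.
with tempfile.TemporaryDirectory() as scratch:
    folder = Path(scratch) / "python_bootcamp"
    folder.mkdir()                 # similar to: mkdir python_bootcamp

    new_file = folder / "example.txt"
    new_file.touch()               # similar to: touch example.txt

    folder_created = folder.is_dir()
    file_created = new_file.is_file()
    file_size = new_file.stat().st_size   # a brand-new file is empty

print(folder_created, file_created, file_size)   # True True 0
```

One nice property of this approach is that the same lines work on Windows, Linux, and OS X, so there is no need for two different commands.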

Note If you don’t see the “.txt” extension, it’s because you don’t have “extensions” checked in your preferences within file explorer.

Checking a Version Number

The terminal is a great way to check the version numbers of software that we download. Since we already downloaded and installed Python, let’s run the following command.

$ python --version
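The same information is also available from inside Python. As a small sketch, the standard sys module exposes the version numbers, which lets a program check for itself that it is running on Python 3. The variable names below are just chosen for this example.

```python
import sys

# sys.version_info holds the numbers that "python --version" reports
major = sys.version_info.major
minor = sys.version_info.minor

version_text = "Python {}.{}".format(major, minor)
print(version_text)

# This book strictly covers Python 3, so it is worth confirming.
is_python3 = major == 3
print("Ready for this book:", is_python3)
```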

Clearing the Terminal Output

Sometimes the terminal gets full of useless output or just becomes tough to read. When you want to clear the output, write the following line (for Windows).

$ cls

For Linux/OS X users, you’ll need to type in the following.

$ clear


Using the Python Shell

Python is a language that requires what is called an “interpreter” to read and run the code we create. When the Python shell is activated, it acts as a local interpreter within the open terminal session. While it’s open, we can write any Python that we wish to execute. This is generally great for practicing small snippets of code, so that you don’t have to open an IDE and run an entire file. To start the Python shell, while we are in the “python_bootcamp” directory, simply type “python” and hit enter. The following will appear.

$ python
Python 3.7.0 (v3)
Type "help", "copyright", "credits" or "license" for more information
>>>

The output will show the Python version you’re currently running. You’ll notice the three arrows (>>>); this means that you’re now working within the Python interpreter. While in the Python shell, everything you write is interpreted as Python. For some reason, you may instead receive the following response.

$ python
'python' is not recognized as an internal or external command, operable program or batch file.

This means that Anaconda and Python were not installed properly. I’d advise you to go back to yesterday’s lesson and reinstall Anaconda following the step-by-step instructions given. You may need to restart your computer as well.

Writing Your First Line of Python

Up to this point, we haven’t done any programming. Generally, I prefer to dive right into coding; however, these basic setup instructions are crucial to getting started as a developer. Although we haven’t gone over any Python just yet, while the interpreter is still running, write the following code next to the arrows and hit enter.

>>> print("Hello, buddy.")

There you go. You’ve just written your first line of Python and should see the following output.

>>> print("Hello, buddy.")
Hello, buddy.
>>>

Exiting the Python Shell

Now, I’ll get to explaining what you just wrote in a later lesson, but for now let’s get out of the Python shell and finish today’s lesson by writing the following line and hitting enter.

>>> exit()

Today’s lesson was all about operating and understanding the terminal. This is an important skill for several developer positions, especially those that use Linux operating systems. Tomorrow we’ll discuss how to operate Jupyter Notebook

Thursday: Using Jupyter Notebook

Jupyter Notebook is going to be where we spend most of our time throughout this book. It’s a powerful tool that is used in the data science community and makes it easier for us to learn Python because we can focus solely on writing code. Today’s lesson is all about how to open this tool, how to use it, and how its cells work.

Note Each lesson will always ask you to open Jupyter Notebook, so keep this page handy in case you need to come back to it


Opening Jupyter Notebook

Jupyter Notebook can be opened through the Anaconda program; however, I want you to start getting used to the terminal and how to operate it, so we’re not going to open it through Anaconda. Instead, we’re going to do this through the terminal. The two benefits to this are

• Jupyter Notebook will open in the same directory that our terminal is in

• Knowing how to use the terminal will help you as a developer

If you still have the terminal session from yesterday open, skip the first step

Step 1: Open Terminal

We need to open the terminal and “cd” into our “python_bootcamp” directory.

$ cd desktop/python_bootcamp

Step 2: Writing the Jupyter Notebook Command

Opening Jupyter Notebook through the terminal is as simple as typing the name of the tool.

$ jupyter notebook

Be sure that you are in the proper directory before typing the command; otherwise it will open wherever your terminal directory is currently located, which is often your user folder. Jupyter Notebook will open in your browser.

Creating a Python File

Anytime we start a new week, we’ll create a new file to work from. To do so, just click the “New” button on the right side of the screen when Jupyter Notebook first opens. Then select “Python 3” (see Figure 1-5).

Figure 1-5. Creating a Python 3 notebook

Once you click the “Python 3” option, a new tab will open with this file. Click the name at the top to rename it, and let’s name this file “Week_01” (see Figure 1-6).

Figure 1-6. Changing the file name

Jupyter Notebook Cells

Now that we’ve opened up Jupyter Notebook and created a file that we can work with, let’s talk about cells. I’m not talking about biology; rather, in this notebook you’ll notice the empty white rectangular section below the tools (see Figure 1-7). These are known as “cells.”

Figure 1-7. Notebook cells highlighted in red

Each cell is where we can write our code, or even use the Markdown language. Let’s write some markdown to begin with.

1. Click in the first cell, so the surrounding area glows blue.

2. In the toolbar, you’ll notice a drop-down menu that says “code.” Click the drop-down, and select “markdown” instead.

3. Within the cell write the following.

# Week 01

Note When writing markdown, the number of hash symbols in a row determines the size of the heading, like HTML header tags.

4. Let’s now run the cell to execute the code. To do this, hold shift and press enter (the cell must be selected).

5. When you use shift + enter, a new cell will appear below the current one. Within this newly created cell, let’s write a simple line of Python to see how the output works. Write the following.

# this is python
print("Hello, buddy.")

Go ahead and run the cell. It will run all the code within the cell and output the result. Again, don’t worry about the actual Python; this lesson is about how Jupyter Notebook cells run. For the rest of this book, we’ll be writing our code inside of Jupyter Notebook files. I’ll be using markdown to specify certain sections, so be sure you’re comfortable with running cells, writing markdown, and creating a new Jupyter Notebook file before moving on.


Today we learned how to use Jupyter Notebook and what we can do with cells. In tomorrow’s lesson, we’ll build our first Python application

Friday: Creating Your First Program

Every Friday will be known as “Project Day,” where we will build a small application or game together that uses the concepts learned throughout the week. This week, however, I’m just going to have you write some code into a cell so that you can see the power of Python. Since we haven’t gone over any Python just yet, I wanted you to be able to experience what we will learn over the upcoming weeks. The code you’re about to write will use concepts from weeks 2, 3, and 4. By the end of these weeks, you’ll be able to fully understand each line of the following code and make your own tweaks to make the program more challenging. We’re going to be working from the Jupyter Notebook file from yesterday’s lesson. If you have closed the program since then, go ahead and reopen the file.

Note If you forgot how to open Jupyter Notebook, go back to yesterday’s lesson and redo the steps, except for creating a file

Line Numbers Introduced

For larger projects, it sometimes becomes tough to follow along with books. For this project, and all other lessons going forward, I’ll be using line numbers. This will make it easier for you to follow along and check if you wrote the code correctly.

1. ← Line numbers will now appear on the left side of all cells, as we will need to write all this code within a single cell.

Be sure to pay attention to these numbers, as you may see them jump a couple lines.

1. # this is the first line in the cell
5. # this is the fifth line in the cell

This means that you should write the second line shown on the 5th line of the cell.

Note Turn lines on by pressing “L” after clicking the cell’s side

Creating the Program

The first thing that we need to do is create a new cell below the current cell in our file. In order to do that, simply follow these steps.

1. Click the last cell in the file.

2. While it is highlighted, go to the “Insert” tab in the menu bar, and click “Insert Cell Below.”

We now have a cell to work with for our project. If you’d like to create a markdown cell that says “Guessing Game” as the header, feel free to look back at the previous lesson and how we did it before. Within that new cell, let’s go ahead and write the following code.

1. # guessing game
2. from random import randint
3. from IPython.display import clear_output
5. guessed = False
6. number = randint(0, 100)
7. guesses = 0
9. while not guessed:
10.     ans = input("Try to guess the number I am thinking of. ")   # use tab to indent
12.     guesses += 1
14.     clear_output()
16.     if int(ans) == number:
17.         print("Congrats. You guessed it correctly.")   # use tab twice to indent twice
18.         print("It took you {} guesses.".format(guesses))
19.         break
20.     elif int(ans) > number:
21.         print("The number is lower than what you guessed.")
22.     elif int(ans) < number:
23.         print("The number is greater than what you guessed.")

This program is not perfect by any means, but it’s certainly fun to try and guess the number that the computer is thinking of. Now, I know that this looks like a foreign language to you right now, but over the next couple of weeks, each line will begin to make sense. Eventually you’ll even be able to make your own changes and improvements to the game. What I want you to do now is run the cell and play the game. Begin to think like a developer, and ask yourself these questions while you play.

• What improvements can I make?

• What makes the program crash?

• What would I do better?

Don’t be afraid if you get an error, it’s all part of the growth of becoming a developer. The fun part about testing the code that you write is that you try to break it. As we go forward, I’ll challenge you with questions about why a line in the code works the way it does. When this happens, try to think about it for a couple minutes, even try to Google the answer. As a developer you’ll find a lot of what you do is Googling a problem. This is what separates good developers from great ones… the ability to figure out problems on their own. With the rest of the lessons in this book, you’ll be well on your way to figuring out problems without my help
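As one possible answer to “What makes the program crash?”, try typing a word instead of a number: converting text such as “forty” to an integer fails. The sketch below is only one way to guard against that, not part of the book’s program. The helper names parse_guess and give_hint are hypothetical, and the sketch uses ideas (functions and error handling) that we haven’t covered yet, so treat it as a preview.

```python
def parse_guess(text):
    """Return the guess as an integer, or None if the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for input such as "forty" or an empty string
        return None

def give_hint(guess, number):
    """Compare a guess with the secret number and describe the result."""
    if guess == number:
        return "correct"
    elif guess > number:
        return "lower"
    return "higher"

print(parse_guess("42"))      # 42
print(parse_guess("forty"))   # None
print(give_hint(30, 50))      # higher
```

With a check like this, the game could ask the player to try again whenever parse_guess returns None, instead of stopping with an error.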

Final Output

All source code for each week is located in the GitHub repository for this book. You can find the link to that repository in the front of the book. To find the specific code for this week, simply open or download the “Week_01.ipynb” file from the GitHub repository. If you ran into errors along the way, be sure to compare what you wrote with the code in this file to see where you went wrong.

Today we were able to see our first Python program in action. Granted, you may not understand everything that is going on yet, but I believe it’s crucial that you see the power of Python. As we go forward, feel free to come back to this program and make your own improvements to it. After all, the only way you get better is by doing.

Weekly Summary

I know this week was a bit slow, but it is a crucial week in the process. We covered how to download the necessary tools, how to use them, and how to use the terminal itself. These topics are important in understanding the content going forward and will help set you up for success. At the end of this week, we programmed a fun guessing game together, which I hope you tried to break and play around with. As a developer it’s important to want to break a program, so that you may improve it. In the upcoming week, the real fun begins. We’ll start to learn the basics of Python and eventually write a small program together.

Weekly Challenges

Each week will have its own challenges at the end that you should certainly try. Completing them will help improve your programming skills. As this week was mostly about setting up, the following challenges won’t be about programming at all. All other weeks, however, will give you good examples to test your abilities.

1. New File: Create a new Jupyter Notebook file called “Week 1 – Challenges.” You should now have two files within the main work folder.

2. Writing Markdown: Within the file from exercise 1, create a cell with markdown in it that says “Challenge 1.” Try several different header sizes. Pick the one you like best.

3. Exploring Python: You should get used to Googling problems or topics that interest you. Try searching for Python topics that interest you, and keep them in mind as you begin to learn the language.

4. Motivating Yourself: Every programmer started from nothing. Each one became a great programmer by pushing themselves to learn the languages they were interested in. Figure out what motivates you to want to become a developer and write it down. Keep this in mind when you begin to struggle.

CHAPTER 2

Python Basics

No matter what famous programmer you think of, like Bill Gates or Guido van Rossum, they also started at a basic level at some point in their life. These basic concepts are a necessity to build a foundation on which you can learn any programming language. After all, you don’t start building a house from the roof down; you need to have a foundation to work from. That’s where this week comes into play.

The focus this week will be on data types and variables. These are core concepts in just about any programming language. The beauty of learning a single language is that it allows you to pick up other languages easily. This is due in part to the fact that all languages follow the same core concepts. By the end of this week, you’ll be able to write simple programs on your own, such as the one that we’ll build together, where we will print information out to the user in a nicely formatted receipt.

This week I also introduce your first challenge question. These questions are to ensure that you begin to “think like a developer.” Some questions may not have definitive answers, but rather they’ll push you to create solutions and problem-solve. It’s important that you spend some time thinking about each question, so that you can begin to train your problem-solving skills. After all, it’s the most sought-after skill in every development industry.

Overview

• Understanding data types

• How to use variables

• Seeing what you can do with strings

• How to manipulate a string

• Coding a program that prints receipts


CHALLENGE QUESTION

In programming, we have a concept called “algorithms.” An algorithm is simply a set of steps. Whether you know it or not, you’ve used algorithms throughout your life. A common algorithm is a recipe that you follow to make food. To think like a developer, you must begin to understand how a computer reads code. A computer is only as smart as the program that it’s supposed to execute. This means that even the smartest computers can fail if the steps aren’t correct. Let’s use a recipe to bake a cake, for instance. If we miss a single step or leave the cake in the oven for too long, then we fail, as would a computer that is missing a crucial step. Now, I’d like you to think about the steps for making a peanut butter and jelly sandwich. Write down your steps on a piece of paper. Try to think like a computer when you write them out and understand that you need to be as precise as possible. The answer will be at the end of this chapter.

Monday: Comments and Basic Data Types

Today marks your first lesson of the Python language. The two concepts taught today will help build that foundation that we’re striving for. To follow along with the content for today, let’s open up Jupyter Notebook from our “python_bootcamp” folder. If needed, go back to last week’s lesson on how to open up Jupyter Notebook. Once it’s open, create a new file, and rename it to “Week_02.” Next, make the first cell markdown, with the following code.

# Comments & Basic Data Types

What Are Comments and Why Use Them?

Comments are like notes that you leave behind, either for yourself or someone else to read. They are not read by the interpreter, meaning that you can write whatever you want, and the computer will ignore it. A good comment will be short, easy to read, and to the point. Putting a comment on every line is tedious, but not putting any comments at all is bad practice. As you program, you’ll begin to understand what that happy medium looks like.

When you begin to write larger programs, you’ll want to leave notes for yourself. Too often have I created a program, stopped working on it for three weeks, and when I came back, forgotten what I was working on. Leaving comments isn’t only good for yourself but also for others who will read your code. Think of comments as breadcrumbs that help you understand what’s going on.

Writing Comments

In Python, we can write comments using the hash (#) symbol. Any text that follows this symbol will be commented out. In the cell below our markdown header, let’s write our first comment.

# this is a comment

Let’s go ahead and run the cell. Notice that nothing happens. This is because the computer completely ignores any comments. For the most part, we’ll write comments on their own line; however, in certain instances you may see comments written in line with code. In the same cell as the previous comment, let’s add the following line.

print("Hello")   # this is also a comment

The first portion of this line will run and output “Hello”, but the second part will be ignored because of the hash symbol.

Note Markdown uses hash characters for headers, just as Python uses them for comments. Make sure you know which type your cell is set to, “markdown” or “code.”

To write multiline comments, so that you can write more descriptive paragraphs for larger portions of code, we use three opening and three closing double quotes.

"""
This is a multi-line comment
"""
print("Hello")   # this is also a comment

Go ahead and run the cell. Notice that the text within the multiline comment gets ignored. These types of comments are great for adding descriptive paragraphs about your code. Be sure not to overuse them, however, as you can certainly make a mess of a program by using too many of them
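To pull the three kinds of comments together, here is a short sketch you could paste into a cell. The variable names total and tag are just made up for this example. It also shows one detail worth knowing: a hash symbol that sits inside quotes is part of a string, not the start of a comment.

```python
# A comment on its own line: Python skips it entirely.
total = 2 + 3  # an inline comment: only the code before the hash runs

"""
A triple-quoted block like this is really an unused string,
which is why it can stand in for a multiline comment.
"""

# A hash symbol inside quotes is part of the string, not a comment.
tag = "#python"

print(total)   # 5
print(tag)     # #python
```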

What Are Data Types?

Almost all languages use data types; they are essential to every program. Data types are how we define values, like words or numbers. If I were to ask you what a sentence is made up of, you would probably reply with “words or characters.” Well, in programming, we call them strings, just as we refer to numbers as their own data types. Data types define what we can do with values and how these values are stored in memory on the computer. In Table 2-1, you’ll find that each row displays a data type, a sample value, and a description. Read each of the following sections for a longer explanation of each type. You can find the four basic types that we cover this week within the table.

Table 2-1. Data type examples

Data Type    Sample Value    Description
Integer                      Whole numbers
Float        5.7             Decimal numbers
Boolean      True            True or False values
String       “Hello”         Characters within quotes
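As a quick check of the table, the following sketch asks Python to name the data type of one value of each kind. The integer 5 is an assumed sample value picked for this example, and the loop is a concept from a later week, so just focus on the output for now.

```python
samples = [5, 5.7, True, "Hello"]   # the integer 5 is an assumed sample value

for value in samples:
    # type(value).__name__ gives the short name Python uses for the data type
    print(repr(value), "is of type", type(value).__name__)

type_names = [type(value).__name__ for value in samples]
```

Python shortens the names to int, float, bool, and str, which you will see often from here on.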

The Print Statement

Before we go any further, I just want to touch on the print statement. In almost every language, you need the ability to output information to the user; within Python we’re able to do this through the print statement. Now I don’t want to get too far in depth, but the print statement is what we call a function in Python. We will cover functions during the entire fifth week. For now, though, just know that the print statement allows us to output information to the user. It works by writing the word “print” followed by parentheses. Whatever is inside the parentheses will be output for the user to see.

Integers

These data types are often called integers or ints. They are positive or negative WHOLE numbers with no decimal point. Integers are used for a variety of purposes, from math calculations to indexing (which we'll get into later); they are a main data type in any language. Let’s go ahead and print a couple of examples in the next cell of our file.

# the following are all integers
print(2)
print(10)

Go ahead and run the cell. The resulting output should be the numbers 2 and 10.

Floats

Any number with a decimal point is known as a floating point data type. It doesn’t matter if it has 1 digit or 20; it’s still a float. The primary use of floats is in math calculations, although they have other uses as well. Let’s check out an example.

# the following are all floats
print(10.953)
print(8.0)   # even this number is a float

Go ahead and run the cell. The output should be the numbers 10.953 and 8.0.

Note The number “8.0” is considered a float, because it includes a decimal point.
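To see the difference between 8 and 8.0 for yourself, here is a small sketch. The variable names are made up for this example, and variables themselves are covered in tomorrow’s lesson.

```python
whole = 8      # no decimal point: an integer
decimal = 8.0  # decimal point: a float, even though the value is the same

print(type(whole).__name__)     # int
print(type(decimal).__name__)   # float
print(whole == decimal)         # True: the two values compare as equal

# Mixing an integer with a float in a calculation produces a float.
mixed = whole + decimal
print(mixed, type(mixed).__name__)   # 16.0 float
```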

Booleans

The boolean data type is either a True or False value. Think of it like a switch, where it’s either off or on. It can’t be assigned any other value except for True or False. Booleans are a key data type, as they provide several uses. One of the most common is for tracking whether something occurred. For instance, if you took a video game and wanted to know if a player was alive, when the player spawned initially, you would set a boolean to “True”. When the player lost all their lives, you would set the boolean to “False”. This way you can simply check the boolean to see if the player is alive or not. This makes for a quicker program than calculating lives each time. Let’s go ahead and run the following.

# the following are booleans
print(True)
print(False)

Go ahead and run that cell. The output should be the words True and False, respectively.
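The video game idea can be sketched in a few lines. This is only an illustration under made-up assumptions: the names player_alive and lives, and the starting count of three lives, are hypothetical.

```python
player_alive = True   # set once, when the player spawns
lives = 3             # hypothetical starting number of lives

# Losing a life three times brings the count down to zero.
lives = lives - 1
lives = lives - 1
lives = lives - 1

# A comparison produces a boolean, so the result can be stored directly.
player_alive = lives > 0

print(player_alive)         # False
print(type(player_alive))   # <class 'bool'>
```

From here on, the rest of the game would only need to check player_alive instead of recounting lives.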

Strings

Also known as “string literals,” these data types are the most complex of the four that we go over today. The actual definition of a string is

Strings in Python are arrays of bytes representing unicode characters.

To most beginners, that’s just going to sound like a bunch of nonsense, so let’s break it down into something simple that we can understand. Strings are nothing more than a set of characters, symbols, numbers, whitespace, and even empty space between two sets of quotation marks. In Python we can use either single or double quotes to create a string. Most of the time it’s personal preference, unless you want to include quotes within a string (see line 3 in the next block). Whatever is wrapped inside of the quotation marks will be considered a string, even if it’s a number. Let’s go ahead and write some examples of strings in the next cell.

# the following are strings
print("")
print("There's a snake in my boot.")
print('True')

The output will include an empty line at the top, as we print out nothing in the first statement.
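Here is a short sketch of the two points made above: choosing the quote style so that a quotation mark can appear inside a string, and the fact that a number wrapped in quotes is a string. The variable names are made up for this example.

```python
quote_inside = "There's a snake in my boot."   # double quotes allow an apostrophe inside
double_inside = 'She said "hello" to me.'      # single quotes allow double quotes inside
number_text = "5"                              # the quotes make this a string, not an integer

print(quote_inside)
print(double_inside)
print(type(number_text).__name__)   # str
print(number_text + number_text)    # 55 (strings are joined together, not added)
```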

MONDAY EXERCISES

1. Output: Print out your name.

2. Type Checking: Try checking the type of a value by using the type() function. This will always tell you what kind of data type you’re checking, which is useful when you’re unsure. As an example.

>>> type(int)   # will output <class 'type'>

Today, we focused on the four essential data types in Python. Understanding the difference between each is key as we move forward. In tomorrow’s lesson, we will begin to understand how to save these data types to be used later in the program

Tuesday: Variables

Variables are one of the most important beginner-level concepts in programming. They allow us to save values into memory using a name that we assign. This lets us use those values later in the program. Yesterday’s lesson covered different data types, but what if you wanted to save one of those values to use later? This works much like how we store information in our brains: variables are stored in computer memory, and we can access them later by referencing the name we used. I won’t go into the theory behind how Python stores information, as we’re focusing more on the application of programming, but it’s worth noting that Python automatically handles memory storage and garbage collection for us. To follow along with this lesson, let’s continue from our previous notebook file “Week_02” and simply add a markdown cell at the bottom that says “Variables.”

How They Work

We declare a name on the left side of the equals operator (“=”), and on the right side, we assign the value that we want to save to use later. Take the following example (no need to write this):

    >>> first_name = "John"


When you create a variable, the line where you assign the value is a step called declaration. We’ve just declared a variable with a name of “first_name” and assigned it the value of the string data type “John”. This string is now stored in memory, and we’re able to access it by calling the variable name “first_name”.

Note Variable names can contain only letters, underscores, and numbers; however, they cannot start with a number.

Handling Naming Errors

All programmers make mistakes, so it’s not a problem if you run into errors. It just comes with the job. Let’s look at a common mistake that occurs with variables (no need to write this):

    >>> Sport = 'baseball' # capital 'S'
    >>> print(sport) # lowercase 's'

If we try to run this code, we’ll get the following error/output:

    NameError: name 'sport' is not defined

This is because the names are completely different. We referenced a variable with a lowercase “s” but declared one with a capital “S.” To fix this, we would capitalize the “s” in sport within print.
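If you would like to see the naming error without the cell stopping, here is a minimal sketch. It assumes a construct we have not covered yet, try/except, which is used here only to catch the error and keep going; the variable name error_message is just for illustration.

```python
# Names are case-sensitive: 'Sport' and 'sport' are two different names.
Sport = 'baseball'       # declared with a capital 'S'

try:
    print(sport)         # lowercase 's' was never declared
except NameError as error:
    # try/except is used here only so the cell keeps running
    error_message = str(error)
    print("NameError:", error_message)

print(Sport)             # using the exact declared name works
```

The last line works because the name matches the declaration exactly, including the capital letter.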

Integer and Float Variables

To store an integer or float in a variable, we give a name to the left of the operator and write a number on the right side. In the next cell, let’s go ahead and write the following code.

    num1 = 5 # storing an integer into a variable
    num2 = 8.4 # storing a float into a variable
    print(num1, num2) # you can print multiple items using commas

Go ahead and run that cell. Notice the output is 5 and 8.4, even though we print out “num1” and “num2.” We’re printing out the value that is stored in those variables.


Boolean Variables

Remember that booleans are True or False values, so storing them is as simple as typing in one of those two words. Let’s write the following.

    # storing a boolean into a variable
    switch = True
    print(switch)

Go ahead and run that cell. The resulting output is “True”. Notice that in Jupyter Notebook, the value of True or False will glow green. This is a good indication that we wrote it correctly.

String Variables

Strings are as easy to store as the previous three data types. Just keep in mind that the single or double quotes are what make a value a string. Let’s go ahead and write the following code in a new cell.

    # storing strings into a variable
    name = 'John Smith'
    fav_number = '9'
    print(name, fav_number) # will print 9 next to the name

Go ahead and run that. Remember that the string “9” is not the same as the integer 9. These two data types act differently, even though the output looks similar.
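To see how the string “9” and the integer 9 act differently, here is a small sketch. The variable names are made up for illustration, and it borrows the addition operator on strings, which we look at more closely later in the week.

```python
fav_number_text = '9'   # a string
fav_number = 9          # an integer

joined = fav_number_text + fav_number_text   # strings are joined end to end
added = fav_number + fav_number              # integers are added together

print(joined)
print(added)
```

The first print shows two characters side by side, while the second shows the result of a calculation.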

Using Multiple Variables

In almost any program you’ll write, you’re going to need to perform some calculations or manipulation on variables. In the following code, we access the values from previously declared variables and add them together to create a sum. Make sure that the previous cells have been run before running this cell. Let’s go ahead and put this in a new cell.

    # using two variables to create another variable
    result = num1 + num2
    print(result)


After running this cell, you’ll notice that it added 5 and 8.4 together to output 13.4.

Note If you get an error saying that a variable doesn’t exist, try running the cell where that variable is declared first.

Using Operators on Numerical Variables

Think of Python as a calculator, where we can alter any variables we want. In the following code, we alter the “result” variable defined previously.

    # adding, subtracting, multiplying, dividing from a variable
    result += 1 # same as saying result = result + 1
    print(result)
    result *= num1 # same as saying result = result * num1
    print(result)

Go ahead and run the cell. In the first line, we added 1 to the result; then later we multiplied it by the value of “num1,” which is 5. All the while, the computer saved the result variable so we could continue to edit it. The final print statement outputs 72.0.
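The same shorthand exists for subtraction and division. Here is a short sketch using a hypothetical variable called balance, which is not part of the lesson’s notebook:

```python
balance = 10     # hypothetical starting value

balance += 5     # same as balance = balance + 5, now 15
balance -= 3     # same as balance = balance - 3, now 12
balance *= 2     # same as balance = balance * 2, now 24
balance /= 4     # same as balance = balance / 4, now 6.0

print(balance)
```

Notice that the final value is a float. In Python 3, division with “/” always produces a float, even when the numbers divide evenly.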

Overwriting Previously Created Variables

Python makes it easy for us to change the value of a variable by simply re-declaring it. In some languages you would have to define the data type, but Python handles all of that for us. We’ve seen this occur with the preceding result variable, but it’s worth noting in its own cell.

    # defining a variable and overwriting its value
    name = 'John'
    print(name)
    name = 'Sam'
    print(name)


Go ahead and run that in a new cell. You’ll notice that the output shows “John” and “Sam”. Where you access or re-declare your variables matters; keep that in mind.

Whitespace

Whitespace just means characters which are used for spacing and have an “empty” representation. In the context of Python, it means tabs and spaces. For example:

    >>> name = 'John Smith'

There’s whitespace to the left and right of the equals operator. It’s not required, but it makes reading the code easier. Python simply ignores the whitespace around the operator when running the code. The space within the string, however, is NOT ignored; it is a “spacing” character that is part of the string’s value.

TUESDAY EXERCISES

1. Variable Output: Store the value 3 in a variable called “x” and the value 10 in a variable called “y”. Save the result of x + y into a separate variable called “result”. Finally, output the information so it shows like the following:

    >>> 3 + 10 = 13

2. Area Calculation: Calculate the area of a 245.54” x 13.66” rectangle. Print out the result. HINT: Area is width multiplied by height.

Variables are used everywhere, and Python makes it easy for us to incorporate them. Being able to store information is a key part of any program. Tomorrow we’ll look at how we can manipulate strings.

Wednesday: Working with Strings

It’s important to understand what you can do with string data types. The next two days cover working with and manipulating strings so that we may build a receipt printing program at the end of the week. We won’t worry about taking in user input but rather how to format strings, what a string index is, etc.


To follow along with this lesson, let’s continue from our previous notebook file “Week_02” and simply add a markdown cell at the bottom that says, “Working with Strings.”

String Concatenation

When we talk about concatenating strings, we mean adding one string to the end of another. This concept is just one of many ways to add string variables together to complete a larger string. For the first example, let’s add three separate strings together.

    # using the addition operator without variables
    name = "John" + " " + "Smith"
    print(name)

Go ahead and run that cell below the markdown cell. The output we get is “John Smith”. We ended up adding two strings that were names and separated them with a string containing a single space. Let’s go ahead and try to store the two names into variables first.

    # using the addition operator with variables
    first_name = "John"
    last_name = "Smith"
    full_name = first_name + " " + last_name
    print(full_name)

Go ahead and run that cell. We get the exact same output as the previous cell; however, we used variables to store the information this time.

Formatting Strings

Earlier we created a full name by adding multiple strings together to create a larger string. While this is perfectly fine to use, for larger strings it becomes tough to read. Imagine that you had to create a sentence that used 10 variables. Appending all ten variables into a sentence is tough to keep track of, not to mention read. We’ll need to use a concept called string formatting. This will allow us to write an entire string and inject the variables we want to use in the proper locations.


.format()

The format method works by putting a period directly after the closing string quotation mark, followed by the keyword “format”. Within the parentheses after the keyword are the variables that will be injected into the string. No matter what data type it is, it will insert it into the string in the proper location, which brings up the question, how does it know where to put it? That’s where the curly brackets come into play. The order of the curly brackets is the same as the order of the variables within the format parentheses. To include multiple variables in one format string, you simply separate each by a comma. Let’s check out some examples.

    # injecting variables using the format method
    name = "John"
    print("Hello {}".format(name))
    print("Hello {}, you are {} years old.".format(name, 28))

Go ahead and run that cell. We’ll see that the output in the first line is “Hello John” and the second “Hello John, you are 28 years old.” Keep in mind that the format method will inject variables and even literal values themselves. In this instance, we injected the integer value 28.
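Because the curly brackets are filled in order, swapping the arguments changes where each value lands. Here is a quick sketch of that behavior; the city variable and the sentence itself are made up for illustration.

```python
name = "John"
age = 28
city = "Boston"   # hypothetical extra value for illustration

# the first {} gets name, the second gets age, the third gets city
sentence = "{} is {} years old and lives in {}.".format(name, age, city)
print(sentence)

# same string, but the arguments are passed in a different order
swapped = "{} is {} years old and lives in {}.".format(city, name, age)
print(swapped)
```

The second sentence is still a valid string, but it no longer makes sense, so always double-check the order of the values you pass in.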

f Strings (New in Python 3.6)

The new way to inject variables into a string in Python is by using what we call f strings. By putting the letter “f” in front of a string, you’re able to inject a variable into a string directly in line. This is important, as it makes the string easier to read when it gets longer, making this the preferred method to format a string. Just keep in mind you need Python 3.6 or later to use this; otherwise you’ll receive an error. To inject a variable in a string, simply wrap curly brackets around the name of the variable. Let’s look at an example.

    # using the new f strings
    name = "John"
    print(f"Hello {name}")

Go ahead and run the cell. We get the same output that we had gotten with the .format() method; however, it’s much easier to read the code this time.
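An f string can hold more than one variable, and the curly brackets can contain an expression as well. A minimal sketch, assuming Python 3.6 or later and a made-up age variable:

```python
name = "John"
age = 28   # hypothetical value for illustration

greeting = f"Hello {name}, you are {age} years old."
print(greeting)

# an expression can go inside the braces, not just a variable name
next_year = f"Next year you will be {age + 1}."
print(next_year)
```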


Note Throughout this book, we’ll be using the .format() method.

Formatting in Python 2

Older Python 2 code typically doesn’t use the .format() method (it was only added in Python 2.6); instead, you would use percent operators to mark the location of the variable being injected. The following is an example that injects the variable “name” into the location of “%s”. The letter after the percent operator signifies the data type. For integers, you would use “%d” for digit. After the string closes, you would place a percent operator, followed by the variables you would like to use. Let’s look at an example.

    # one major difference between versions 2 & 3
    name = 'John'
    print('Hello, %s' % name)

Go ahead and run that cell. You’ll notice that we get the same output as the previous methods. If you wanted to format a string in Python 2 with multiple variables, then you would need to write the following.

    # python 2 multiple variable formatting
    first_name = "John"
    last_name = "Smith"
    print("Hello, %s %s" % (first_name, last_name)) # surround the variables in parentheses

Go ahead and run the cell. We’ll get the output “Hello, John Smith”. When passing multiple variables, you need to surround the variable names within parentheses and separate each by a comma. Notice there are also two symbols within the string that represent the location of each respective variable in order from left to right.

String Index

One other key concept that we need to understand about strings is how they are stored. When a computer saves a string into memory, each character within the string is assigned what we call an “index.” An index is essentially the position of a character within the string. Think of an index as a position in a line that you’re waiting in at the mall. If you were at the front of the line, you would be given an index number of zero. The person behind you would be given index position one. The person behind them would be given index position two, and so on.

Note Indexing in most languages, including Python, starts at 0, not 1.

The same is true for Python strings. If we take a string like “Hello” and break down its indexes (see Figure 2-1), we can see that the letter “H” is located at index zero. Let’s try an example.

    # using indexes to print each element
    word = "Hello"
    print(word[0]) # will output 'H'
    print(word[1]) # will output 'e'
    print(word[-1]) # will output 'o'

In order to index a specific element, you use square brackets to the right of the variable name. Within those square brackets, you put the index location you wish to access. In the preceding case, we’re accessing the first two elements in the string “Hello” stored in the variable “word”. The last line accesses the element in the last position. Using negative index numbers will result in trying to access information from the back, such that -4 would result in the output of the letter “e”.

Figure 2-1. Index locations for a string

Be very careful when working with indexes. An index is a specific position within the string. If you try to access a position that is out of range, Python will stop your program with an error because that position does not exist. For example, trying to access index 5 of “Hello” raises an IndexError, since the last valid index is 4.
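Here is a small sketch of what going out of range looks like. It assumes two things we haven’t covered yet: the built-in len() function, which counts the characters in a string, and try/except, which is used only so the cell keeps running after the error.

```python
word = "Hello"

last_valid_index = len(word) - 1   # len(word) is 5, so the last index is 4
print(word[last_valid_index])      # the last character

try:
    print(word[5])                 # one position past the end
except IndexError as error:
    index_error_message = str(error)
    print("IndexError:", index_error_message)
```

Since indexing starts at zero, the last valid index is always one less than the number of characters.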


String Slicing

I want to just quickly introduce the topic of slicing. Slicing is used mostly with Python lists; however, you can use it on strings as well. Slicing is essentially when you only want a piece of the variable, such that if we only wanted “He” from the word “Hello”, we would write the following.

    print(word[0:2]) # will output 'He'

The first number in the bracket is the starting index; the second is the stopping index, which is not included in the result. We will touch on this concept in a later week; however, feel free to mess around with slicing. Before the day ends though, I’d like to quickly cover the start, stop, and step arguments when slicing. The syntax for slicing is always

    >>> variable_name[start:stop:step]

In the previous cell, we only included the start and stop because the step is optional and defaults to incrementing by one each time. However, what if we wanted to print every other letter?

    print(word[0:5:2]) # will output 'Hlo'

Go ahead and run the cell. By passing the step as the number two, it increments the index by two each time instead of one. We will cover this more in depth in a later chapter; for now let this be an introduction into slicing with all three arguments.
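The start and stop values can also be left out, in which case the slice runs from the very beginning or to the very end. A short sketch using the same word; the variable names on the left are just for illustration:

```python
word = "Hello"

first_two = word[:2]     # start left out: begins at index 0
the_rest = word[2:]      # stop left out: runs to the end
middle = word[1:4]       # indexes 1, 2, and 3 (the stop index is not included)
every_other = word[::2]  # start and stop left out, step of two

print(first_two, the_rest, middle, every_other)
```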

WEDNESDAY EXERCISES

1. Variable Injection: Create a print statement that injects an integer, float, boolean, and string all into one line. The output should look like “23 4.5 False John”.
2. Fill in the Blanks: Using the format method, fill in the following blanks by assigning your name and favorite activities into variables.

    "{}'s favorite sports is {}."
    "{} is working on {} programming."


We covered some key concepts when working with strings today: formatting and indexing. Tomorrow we’ll use other methods that will help us manipulate strings.

Thursday: String Manipulation

In many programs that you’ll build, you’re going to want to alter strings in one way or another. String manipulation just means that we want to alter what the current string is. Luckily, Python has plenty of methods that we can use to alter string data types. To follow along, let’s continue from our previous notebook file “Week_02” and simply add a markdown cell at the bottom that says, “Manipulating Strings.”

.title()

Often, you’ll run into words that aren’t capitalized but should be, usually names. The title method capitalizes the first letter of each word in a string. Try the following.

    # using the title method to capitalize a string
    name = "john smith"
    print(name.title())

Go ahead and run that cell. The output we get is “John Smith” with a capital letter on each word. This method is great for formatting names correctly.

Note Try using name.lower() and name.upper() and see what happens.

.replace()

The replace method works like a find and replace tool. It takes in two values within its parentheses, one that it searches for and the other that it replaces the searched value with.

    # replacing an exclamation point with a period
    words = "Hello there!"
    print(words.replace("!", "."))


Go ahead and run that cell. This will result in an output of “Hello there.”

Note For the replace to be stored properly afterward, we would have to re-declare our words variable: words = words.replace('!', '.')
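The reason for the preceding note is that replace gives back a new string and leaves the original alone. Here is a minimal sketch of that behavior; the variable name changed is made up for illustration.

```python
words = "Hello there!"

changed = words.replace("!", ".")   # replace returns a new string
print(words)                        # the original still ends with '!'
print(changed)                      # the new string ends with '.'

original_untouched = (words == "Hello there!")

words = words.replace("!", ".")     # re-declare to keep the change
print(words)
```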

.find()

The find method will search for any string we ask it to. In this example, we try to search for an entire word, but we could search for anything including a character or a full sentence.

    # finding the starting index of our searched term
    s = "Look over that way"
    print(s.find("over"))

Go ahead and run that cell. You’ll notice that we got an output of 5. Find returns the starting index position of the match. If you count where the word “over” begins, the “o” is at index location 5. This is important when you want to access a specific index on a search.
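Two things worth knowing about find are sketched here: it returns -1 when the search term isn’t in the string, and the index it returns can be used with slicing. The search term “under” is just a made-up example of something that isn’t there.

```python
s = "Look over that way"

start = s.find("over")      # index where the match begins
missing = s.find("under")   # -1 means the term was not found
print(start, missing)

# the returned index can be used to slice from the match onward
from_match = s[start:]
print(from_match)
```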

.strip()

In cases where you want to get rid of a certain character on the left and right side of a string, you would use the strip method. By default, it will remove whitespace. Let’s try running the following.

    # removing white space with strip
    name = " john "
    print(name.strip())

The output will produce “john” because we’ve removed all the spaces on the left and right side.

Note Try .lstrip() and .rstrip() and see what happens.


.split()

I won’t go into too much detail with split simply because what it returns is a list and we haven’t covered those quite yet; however, I wanted you to see how to use this method. What it does is separate the words in the sentence into a group of words, stored within a list. Now don’t worry about lists just yet; we’ll get there. For now, let’s just see how this method works.

    # converting a string into a list of words
    s = "These words are separated by spaces"
    print(s.split(" "))

Go ahead and run the cell. The output results in a list of words: ['These', 'words', 'are', 'separated', 'by', 'spaces']. We’ll come back to this method and why it’s important.
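The value passed into split doesn’t have to be a space. As a quick sketch, here is the same method separating on commas instead; the csv_line string is a made-up example.

```python
# splitting on a comma instead of a space
csv_line = "Books,Computer,Monitor"   # hypothetical comma-separated text
items = csv_line.split(",")
print(items)
```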

THURSDAY EXERCISES

1. Uppercasing: Try manipulating the string “uppercase” so it prints out as all uppercase letters. You’ll need to look up a new method.
2. Strip Symbols: Strip all the dollar signs from the left side of this string “$$John Smith”. Try it with .lstrip() and .strip(). To see a description on how to use the strip method further, try using the help function in Python by typing the following:

    >>> help(" ".strip)

Today you learned a handful of manipulation methods, but there are many more. Try experimenting with others that you find on the Web.

Friday: Creating a Receipt Printing Program

Welcome to your first project. We’ll be creating a very basic receipt printing program. For this week, as we’ve learned about variables, operators, and string manipulation, we’ll be using these skills in order to create this program.


To follow along, let’s continue from our “Week_02” notebook and simply add a markdown cell at the bottom that says, “Friday Project: Printing Receipts.”

Final Design

It’s always good to picture the design of what you’re trying to build. For larger projects, you’ll want to create a flow chart or some sort of design document that will keep you on track. This way you don’t stray from the intended result. For us, we’ll be building a small receipt printing program with the concepts we’ve learned, in which the output will look like Figure 2-2.

Figure 2-2. End result of Friday project

Let’s begin, shall we?

Initial Process

Whenever you begin a project, you must always understand where to start. No matter the size of the project, there are certain dependencies. Like building a house, you must have a foundation before you can put the roof on. Now, this program will be around 50 lines and have little to no dependencies, so we’ll start with the top border and work our way down to the bottom.


Defining Our Variables

In the cell below our markdown header, let’s begin to define the variables that we’ll be working with throughout this program.

    1. # create a product and price for three items
    2. p1_name, p1_price = "Books", 49.95
    3. p2_name, p2_price = "Computer", 579.99
    4. p3_name, p3_price = "Monitor", 124.89

I always like to introduce new concepts while building out these Friday projects, as it’s good to implement good coding techniques. The technique introduced within this block is the ability to declare multiple variables on the same line. To do so, we simply separate the variable names and their associated values by a comma. Looking at the first two variables declared, the value of “Books” will be saved into the variable name “p1_name”, and the value 49.95 will be saved into the variable name “p1_price”. Rather than writing six lines, we’ve reduced our program by half already. The fewer lines we use the better (most times). Variables such as x and y, or in our case, a name and price, are good examples of declaring variables associated together in one line. Next, let’s define the variables we’ll be using for the company at the top of the receipt. All the code for this project may be done in a single cell, or you can separate the cells. It’s up to you. I’ve provided line numbers in case you follow along on a single cell.

    6. # create a company name and information
    7. company_name = "coding temple, inc."
    8. company_address = "283 Franklin St."
    9. company_city = "Boston, MA"

As an example, we’ve left the company name all lowercase so that we can use a string manipulation method to fix this issue. Lastly, let’s declare the message that we’ll output to the user at the bottom of the receipt.

    11. # declare ending message
    12. message = "Thanks for shopping with us today."

Go ahead and run the cell. Now that we’ve defined all our variables, we can move on.
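If declaring several variables on one line still feels unfamiliar, it can be tried on its own in a scratch cell. A minimal sketch, using made-up names that mirror the project’s variables:

```python
# the first name gets the first value, the second name gets the second value
x, y = 3, 10
print(x, y)

# hypothetical names mirroring the project's product variables
item_name, item_price = "Books", 49.95
print(item_name, item_price)
```

The number of names on the left must match the number of values on the right; otherwise Python raises an error.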


Creating the Top Border

As we can see from the design that we’ve laid out at the beginning of this project, we’ll need to print out a border on the top and bottom. Let’s start with the top border.

    14. # create a top border
    15. print("*" * 50)

Go ahead and run the cell. There’s a new concept being applied here, where we write “*” * 50. All we’re trying to do is print out 50 stars in a row for a top border, and rather than making 50 print statements, we can simply multiply the string by the number we want. This way we get our top border while keeping our code slim and easy to read. Readability of code is always key.
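One possible variation, shown here only as a sketch and not as part of the project’s numbered lines, is to store the width in a variable so every border and divider stays the same length. The names width, top_border, and divider are made up, and len() is a built-in function that counts characters.

```python
width = 50                  # hypothetical variable holding the receipt width

top_border = "*" * width    # the string is repeated 'width' times
divider = "=" * width

print(top_border)
print(divider)
print(len(top_border))      # confirms how many characters were produced
```

Changing the single width value would then resize every line that uses it.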

Displaying the Company Info

We’ve already defined our variables for the company in the preceding lines, so let’s display them.

    17. # print company information first, using format
    18. print("\t\t{}".format(company_name.title()))
    19. print("\t\t{}".format(company_address))
    20. print("\t\t{}".format(company_city))

Go ahead and run the cell. These print statements may seem a little hard to understand at first; however, I’m introducing an escape character to you. Escape characters begin with the backslash “\” character. Whatever comes directly after that backslash tells the computer which special character to produce. In the three print statements, we use “\t” for a tab indentation. Another popular escape character you may see is “\n”, which means newline and acts as if you hit the enter key. We use two escape characters in a row to center the text within our output. Let’s create a divider.

    22. # print a line between sections
    23. print("=" * 50)


Go ahead and run the cell. Just as we did for our top border, we multiply the equals symbol by 50 to create a line of the same width. This will give the appearance of separate sections.
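To get a feel for the escape characters introduced above, they can be tried on their own in a scratch cell. A small sketch with made-up strings:

```python
tabbed = "Product\tPrice"          # \t inserts a tab between the two words
two_lines = "Line one\nLine two"   # \n moves the rest onto a new line

print(tabbed)
print(two_lines)

# each escape sequence counts as a single character
tab_length = len("\t")
print(tab_length)
```

Even though we type two symbols, a backslash and a letter, Python stores each escape sequence as one character.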

Displaying the Product Info

Looking at our original design, we want to create a header before we list out each product’s name and price. This can be done simply by using our escape characters for indenting.

    25. # print out header for section of items
    26. print("\tProduct Name\tProduct Price")

Go ahead and run the cell. Due to the size of the header names, we only need to use a single tab before each header. Now we can go ahead and output a row for each product’s information.

    28. # create a print statement for each product
    29. print("\t{}\t\t${}".format(p1_name.title(), p1_price))
    30. print("\t{}\t\t${}".format(p2_name.title(), p2_price))
    31. print("\t{}\t\t${}".format(p3_name.title(), p3_price))

Go ahead and run the cell. We’re using similar styles as the previous print statements in order to center each product’s title and price under their respective headers. Try not to get too confused by all the symbols within the print string; you can simply break them down to a tab, followed by the first variable being formatted into the string, followed by two tabs, followed by a dollar sign (in order to make the price look like currency), and followed by the second variable being formatted into the string. This completes the section for our items, so let’s put in another section divider.

    33. # print a line between sections
    34. print('=' * 50)

Go ahead and run the cell. This will set us up for our next section to display the total.


Displaying the Total

Like the products section, we want to create a header for our total, but we want to also center it underneath the price column of the products section. To do so, we’ll use three tabs.

    36. # print out header for section of total
    37. print("\t\t\tTotal")

Go ahead and run the cell. Now that we have our total header aligned with the price column in products, we can output our total on the next line. Before we can print out a total, however, we must first calculate the total, which is the sum of all our products. Let’s define a variable called total and then print it out.

    39. # calculate total price and print out
    40. total = p1_price + p2_price + p3_price
    41. print("\t\t\t${}".format(total))

Go ahead and run the cell. Again, we’ve gone ahead and added three tabs, plus a dollar sign to make the total value appear as currency. Let’s now add a section border.

    43. # print a line between sections
    44. print("=" * 50)

Go ahead and run the cell to make sure it looks like the desired output so far.
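Adding floats together can sometimes leave a long trail of decimal places in the result. If that happens with your total, one possible way to tidy it up, shown here as a sketch that goes slightly beyond this week’s material, is the built-in round() function or a “.2f” inside the curly brackets, which tells format to show exactly two decimal places.

```python
p1_price, p2_price, p3_price = 49.95, 579.99, 124.89
total = p1_price + p2_price + p3_price

print(total)              # may or may not show extra decimal places
print(round(total, 2))    # rounded to two decimal places

# ':.2f' inside the braces always shows exactly two decimals
formatted_total = "${:.2f}".format(total)
print(formatted_total)
```

The project itself keeps the simpler “${}” version; this is just something to try when you come back to improve it.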

Displaying the Ending Message

To display the final thank you message, our design has it spaced out slightly more than any other section, so we’ll need to add a couple of newlines to give it some extra spacing.

    46. # output thank you message
    47. print("\n\t{}\n".format(message))

Go ahead and run the cell. Our message is now centered, and we’re ready to move on.


Displaying the Bottom Border

To finish off this simple printing program, we need to throw in a bottom border for aesthetics.

    49. # create a bottom border
    50. print("*" * 50)

Go ahead and run the cell one last time.

Congratulations. As simple as it may be, it’s a huge milestone. After learning more material, try coming back here to improve it.

Weekly Summary

This week we went over some very important foundational concepts in programming with variables and working with strings. You must always keep in mind that variables need to be declared before you can use them and that the name associated is saved in memory with the value on the right side of the equals operator. Strings are easy to work with in Python, as the language has a variety of methods that we can call in order to do the work for us. At the end of the week, we were able to build a simple receipt printing program. Try breaking the program. I always encourage students to try and break programs because it will teach you how to fix them.

Challenge Question Solution

There isn’t a definitive solution to making a PB&J sandwich, but I want you to go back and see if you weren’t specific enough. Computers are only as smart as we program them to be, so if you said to put the peanut butter on the bread, it may just interpret it as putting the entire jar on the bread instead. As a developer you need to be specific with your descriptions. Even try rewriting a new algorithm with improved steps.


Weekly Challenges

To test out your skills, try these challenges:

1. Side Borders: In the Friday project, we ended up creating borders above and below the information printed out. Try adding a star border on the sides as well now.
2. Researching Methods: We’ve gone over a few of the string manipulation methods that are widely used; however, there are many more. Try looking up some and implementing them.
3. Reverse: Declare a variable equal to “Hello”. Reverse the string using slicing. Try looking it up if you struggle.

Tip You can define a start, stop, and step when slicing.

CHAPTER 3

User Input and Conditionals

Welcome to Week 3. This week we’ll be introducing how to work with user input and making decisions within our programs. These “decisions” are known as branching statements or conditionals. If you think of your life every day, you make decisions based on specific conditions without even realizing it, such as when to get up in the morning, what to have for lunch, when to eat, etc. The same applies in programming, where we need to have the computer make decisions.

Overview

• Working with user input
• How to use “if” statements to make decisions
• How to use “elif” statements to make multiple decisions
• How to use “else” statements to make decisions no matter what
• Building a calculator with decision-making and user input

CHALLENGE QUESTION

This week’s challenge is to test your ability to read code. I want you to read the code block and think about whether it will work or not. If you believe it will not work, I want you to make a note of why it won’t. It’s important to be able to both read and write code.

    >>> print('{} is my favorite sport'.format(element))
    >>> element = 'Football'

After you’ve written down your answer, go ahead and run the code within a cell. If your answer was incorrect, try to analyze why. The answer will be at the end of this chapter.

© Connor P. Milliken 2020 C. P. Milliken, Python Projects for Beginners, https://doi.org/10.1007/978-1-4842-5355-7_3


Monday: User Input and Type Converting

In today’s lesson we’ll introduce the ability to interact with the user and a concept called type conversion. These will be necessary to understand how to build the calculator at the end of the week. To follow along with the content for today, let’s open up Jupyter Notebook from our “python_bootcamp” folder. Once it’s open, create a new file, and rename it to “Week_03.” Next, make the first cell a markdown cell with a header saying “User Input & Type Converting.” We’ll begin working underneath that cell.

Accepting User Input

In many programs we’ll be creating, you’ll need to accept user input. To do so, we need to use the input() function. Like the print function, input will print the string inside of the parentheses, but it will also create a box for the user to enter information. Let’s look at an example.

    # accepting and outputting user input
    print(input("What is your name? "))

Go ahead and run that cell. You’ll notice that the cell will output whatever you write within the box. When the interpreter comes across the input function, it will pause until enter is pressed.

Note Information entered is taken into the program as a string.

Storing User Input

In the previous cell, we simply printed out the input that the user put in. However, in order to work with the data that they enter, we need to store it into a variable.

    # saving what the user inputs
    ans = input("What is your name? ")
    print("Hello {}.".format(ans))


Go ahead and run that cell. Storing the information that the user puts in our program is as easy as storing it into a variable. This way we can work with the data they input at any point.

What Is Type Converting?

Python defines type conversion functions to directly convert one data type to another, which is useful in day-to-day and competitive programming. In some situations, the data you’re working with may not be the correct type. The most obvious example is user input because no matter what the user types in, the input is taken as a string. If you are expecting a number to be input, you’ll need to convert the input to an integer data type, so that you’re able to work with it.

Checking the Type

Before we go over how to type convert, I’d like to touch on an important function that Python has which allows us to check the type of any given variable.

    # how to check the data type of a variable
    num = 5
    print( type(num) )

Go ahead and run that cell. The output here will be “<class 'int'>”. Don’t worry about the class portion here; we’ll get into classes another week. Focus on the second part where it outputs the type as an integer. This allows us to check what data type we’re working with.

Converting Data Types

Python gives us the ability to type convert easily from one type to another simply by wrapping the type around the variable. Let’s check out an example of converting a string to an int.

    # converting a variable from one data type to another
    num = "9"
    num = int(num)       # re-declaring num to store an integer
    print( type(num) )   # checking type to make sure conversion worked


Go ahead and run that cell. We’ve just converted the string of “9” to an integer. Now we can use the variable num in any calculations. For the conversion to process correctly, we used the int() type conversion. Whatever value is put inside of the parentheses is converted into an int. Check Table 3-1 for how to convert from one data type to another.

Table 3-1. Converting data types

Current Type   Data Value   Converting to   Proper Code     Output
Integer        9            String          str(9)          '9'
Integer        5            Float           float(5)        5.0
Float          5.6          Integer         int(5.6)        5
String         '9'          Integer         int('9')        9
String         'True'       Boolean         bool('True')    True
Boolean        True         Integer         int(True)       1

As you can see, there are several ways to type convert; you just need to use the name of the data type you’re converting to. The boolean value True converts to the integer 1 because the True and False values represent 1 and 0, respectively. Also, converting a float to an integer simply truncates the decimal point and any digits to the right of it; it does not round.
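Two of these conversions are easy to misread, so here is a small sketch of how they behave. The variable names, and the string comparison at the end, are illustrative choices rather than anything from the lesson.

```python
# Truncation drops the fractional part; it does not round.
truncated = int(5.6)              # 5
truncated_negative = int(-5.6)    # -5, truncation moves toward zero
rounded = round(5.6)              # 6, shown for comparison

# bool() on a string only checks whether the string is empty,
# so the text "False" still converts to True.
from_true_text = bool("True")     # True
from_false_text = bool("False")   # True, because the string is not empty
from_empty_text = bool("")        # False

# One possible way to read a true/false answer typed as text.
answer = "False"
is_true = answer.strip().lower() == "true"   # False

print(truncated, truncated_negative, rounded)
print(from_true_text, from_false_text, from_empty_text)
print(is_true)
```

In other words, bool('True') gives True because the string has characters in it, not because Python reads the word.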

Note: Not all data types can be converted properly. There are limits.

Converting User Input

Let’s try working with a user’s input in order to add 100 to whatever they type.

    # working with user input to perform calculations
    ans = input("Type a number to add: ")
    print( type(ans) )   # default type is string, must convert
    result = 100 + int(ans)
    print( "100 + {} = {}".format(ans, result) )


Go ahead and run that cell. Inputting the number “9” will give us a proper result. The input always arrives as a string, as noted by the first print statement in this cell, so it must be converted before we can do math with it. That conversion would not work with the word “nine,” however, because int() can only convert strings that look like whole numbers.

Handling Errors

In the last cell, we convert the user input to an integer; however, what if they put in a word instead? The program would break right away. As developers, we must assume that the user won’t always enter the information that we expect them to. To handle this issue, we’re going to introduce try and except blocks. Try and except are used to catch errors. They work by trying to run what is inside the try block; if it doesn’t produce an error, then the program continues without hitting the except block; however, if an error occurs, then the code in the except block runs. This makes sure your program doesn’t stop running if an error pops up. This is a generic way to handle errors; there are other approaches, like checking the input first with the string methods isalpha() and isalnum(). Let’s look at an example using the try and except blocks.

    # using the try and except blocks, use tab to indent where necessary
    try:
        ans = float( input("Type a number to add: ") )
        print( "100 + {} = {}".format(ans, 100 + ans) )
    except:
        print("You did not put in a valid number.")
    # without try/except print statement would not get hit if error occurs
    print("The program did not break.")

Go ahead and run that cell. Try inputting different answers including non-numbers. You’ll notice that our nonvalid print statement will output if you don’t input a number. If we didn’t have the try and except in place, the program would break, and the last print statement wouldn’t occur.
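A bare except catches every kind of error, including ones we did not plan for. A slightly narrower variation, sketched here under the assumption that we only care about bad number input, names the ValueError that float() raises. The variable ans is a hypothetical stand-in for the user’s answer.

```python
# `ans` is a hypothetical stand-in for what input() would return.
ans = "nine"

try:
    number = float(ans)
    message = "100 + {} = {}".format(number, 100 + number)
except ValueError:
    # float() raises ValueError when the text is not a valid number.
    message = "You did not put in a valid number."

print(message)
print("The program did not break.")
```

Change ans to "9" and the try block finishes normally, so the except block is skipped.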


Code Blocks and Indentation

In most other programming languages, indentation is used only to help make the code look pretty. For Python though, it is required for indicating a block of code. Let’s take our previous code from the “Handling Errors” section. The two lines after our try statement are indented and are known as a block of code. These lines belong to the try statement because they are directly indented after the statement. The same goes for our other print statement within the except block. It’s the reason that our nonvalid print statement only runs if the except block runs. All blocks of code need to be connected to a statement; you can’t indent a section randomly.

Note: The indents must be consistent. It does not always need to be four spaces; however, four spaces per level is the usual convention, and Jupyter Notebook inserts four spaces when you press the tab key, so it’s usually easiest to indent with tab.

MONDAY EXERCISES

1. Converting: Try converting a string of “True” to a boolean, and then output its type to make sure it converted properly.
2. Sum of Inputs: Create two input statements, and ask the user to enter two numbers. Print the sum of these numbers out.
3. Car Information: Ask the user to input the year, make, model, and color of their car, and print a nicely formatted statement like “2018 Blue Chevrolet Silverado.”

Today was an important step in covering user input, how to convert from one data type to another, and how to handle errors

Tuesday: If Statements

Today we’ll learn all about how to make decisions in our code. This will give us the ability to have our programs decide what lines of code to run, depending on what the user inputs, calculations, etc. This is the most important lesson of this week. Be sure to spend a good amount of time going over today’s lesson.


To follow along with this lesson, let’s continue from our previous notebook file “Week_03” and simply add a markdown cell at the bottom that says, “If Statements.”

How They Work

Every day you make hundreds of decisions. These decisions define what you do with your day. In programming these are known as branching statements or “if statements.” An if statement works the same way that a decision is made. You check a condition, and if that condition is true, you perform the task, and if it’s not true, then you move on without performing that task.

“Am I hungry?” “Yes, so I should make some food. ” *** proceeds to cook food *** The same decision-making process can be implemented in programming using an if statement

Writing Your First If Statement

All branching statements begin the same way, with the keyword “if”. Following the keyword is what is known as a condition. Lastly, there will always be an ending colon at the end of the statement. The if statement checks to see if the given condition is True or False. If the condition is True, then the code block runs. If it is False, then the program continues without running any of the code indented directly after the if statement. Let’s try it out.

    # using an if statement to only run code if the condition is met
    x, y = 5, 10
    if x < y:
        print("x is less than y")

Go ahead and run that cell. Notice here that the output is “x is less than y”. This is because we originally declared x equal to 5 and y equal to 10 and then used an if statement to check if x was less than y, which it was. If x were equal to 15, then the print statement indented after the “if” would never have run, because the condition would have been False.


Comparison Operators

Before we continue with branching statements, we need to go over comparison operators. So far, we’ve used arithmetic operators for adding and subtracting values and assignment operators for declaring variables, and with the introduction of the “if statement,” we’ve now seen comparison operators. There are several comparisons that you’re able to make. Most comparison operators that you’ll use, however, are shown in Table 3-2.

Table 3-2. Comparison operators

Operator   Condition          Functionality   Example
==         Equality           if x == y       if x is equal to y…
!=         Inequality         if x != y       if x does not equal y…
>          Greater than       if x > y        if x is greater than y…
>=         Greater or equal   if x >= y       if x is greater or equal to y…

    x, y = 5, 10
    if x > y:
        print("x is greater")
    elif (x + 10) < y:    # checking if 15 is less than 10
        print("x is less")
    elif (x + 5) == y:    # checking if 10 is equal to 10
        print("equal")

Go ahead and run that cell. The resulting output is “equal”. The first if and elif statements both returned False, but the second elif statement returned True, which is why that block of code ran. You can have as many elifs as you want, but they must be associated with an if statement.


Note: Within the conditional, we perform addition, and we wrap it in parentheses to make it clear that the math operation executes first.
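As a side note, Python evaluates arithmetic before comparisons even when the parentheses are left out, so they are there mainly for readability. A quick sketch, with variable names chosen only for illustration:

```python
x, y = 5, 10

# Arithmetic operators bind more tightly than comparison operators,
# so both spellings below evaluate x + 5 before comparing it to y.
with_parens = (x + 5) == y
without_parens = x + 5 == y

print(with_parens, without_parens)
```

Both conditions come out the same; keeping the parentheses simply saves the reader from having to remember the rule.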

Conditionals Within Conditionals

We’ve gone over how Python uses indentation to separate blocks of code. So far, we’ve only seen one indentation level, but what if we added an if statement within an if statement?

    # writing multiple conditionals within each other - multiple block levels
    x, y, z = 5, 10, 5
    if x > y:
        print("greater")
    elif x

    >>> name = "John"
    >>> if name == "Jack":
    >>>     print("Hello Jack")
    >>> elif:
    >>>     print("Hello John")

2. User Input: Ask the user to input the time of day in military time without a colon (1100 = 11:00 AM). Write a conditional statement so that it outputs the following:

a. “Good Morning” if less than 1200

b. “Good Afternoon” if between 1200 and 1700

c. “Good Evening” if equal or above 1700

Today we learned all about else statements. You’re now able to build programs that run different code depending on a condition.

Friday: Creating a Calculator

Last week we built a receipt printing program together. With the lessons learned from this week, we’re going to be building a simple calculator that accepts user input and outputs the proper result. To follow along with this lesson, let’s continue from our previous notebook file “Week_03” and add a markdown cell at the bottom that says, “Friday Project: Creating a Calculator.”


Final Design

For each week we always want to lay out the final design. As this week is based around the logic rather than how it looks, we’ll lay out the steps necessary to build our calculator.

1. Ask the user for the calculation they would like to perform.
2. Ask the user for the numbers they would like to run the operation on.
3. Set up try/except clause for mathematical operation.
   a. Convert numbers input to floats.
   b. Perform operation and print result.
   c. If an exception is hit, print error.

Step #1: Ask User for Calculation to Be Performed

For each one of these steps, let’s put the code in separate cells. This will allow us to section off the specific steps for our project, making it easier to test each step. The first step is to ask the user to input the mathematical operation to be performed (add, subtract, etc.).

    # step 1: ask user for calculation to be performed
    operation = input("Would you like to add/subtract/multiply/divide? ").lower()
    print( "You chose {}.".format(operation) )   # for testing purposes

Go ahead and run that cell. Depending on what the user inputs, your output will print what they chose. You’ll notice that on the line where we accept the input, we also convert it to lowercase right away. This is to avoid case-sensitive issues later. Our print statement is simply for testing purposes on this cell only and will be removed later.


Step #2: Ask for Numbers, Alert Order Matters

In the cell below step #1, we’ll need to create the next step of our logic. Here, we ask the user to input a couple of numbers and output those numbers for testing purposes.

    # step 2: ask for numbers, alert order matters for subtracting and dividing
    if operation == "subtract" or operation == "divide":
        print( "You chose {}.".format(operation) )
        print("Please keep in mind that the order of your numbers matters.")
    num1 = input("What is the first number? ")
    num2 = input("What is the second number? ")
    print( "First Number: {}".format(num1) )    # for testing purposes
    print( "Second Number: {}".format(num2) )   # for testing purposes

Go ahead and run that cell. Notice that we put in a print statement alerting the user that if they chose subtraction or division, the order of numbers matters. This is important as num1 will always be on the left side of the operator (in our program), which makes a huge difference.

Note: Rerun the previous cell if you get an error saying that a variable is not defined.

Step #3: Set Up Try/Except for Mathematical Operation

The third, and final step, is to try performing the operation. The reason for setting up a try/except block here is that we must convert the user’s input to floating data types. We must assume that they may not enter the proper input. Let’s see how this cell will work.

    # step 3: setup try/except for mathematical operation
    try:
        # step 3a: immediately try to convert numbers input to floats
        num1, num2 = float(num1), float(num2)
        # step 3b: perform operation and print result
        if operation == "add":
            result = num1 + num2
            print( "{} + {} = {}".format(num1, num2, result) )
        elif operation == "subtract":
            result = num1 - num2
            print( "{} - {} = {}".format(num1, num2, result) )
        elif operation == "multiply":
            result = num1 * num2
            print( "{} * {} = {}".format(num1, num2, result) )
        elif operation == "divide":
            result = num1 / num2
            print( "{} / {} = {}".format(num1, num2, result) )
        else:
            # else will be hit if they didn't choose an option correctly
            print("Sorry, but '{}' is not an option.".format(operation))
    except:
        # step 3c: print error
        print("Error: Improper numbers used. Please try again.")

Go ahead and run that cell. There’s a lot going on here so let’s start from the top. We set up a try block and immediately convert the user’s input to floats. If this causes an error, the except clause will be hit and output that an error occurred rather than the program breaking. If the input can be converted, then we set up an if/elif/else statement to perform the calculation and output the proper result. If they didn’t input a proper operation, then we let them know. This cell is dependent on the previous two. If you’re getting errors, rerun the previous cells.

Final Output Now that we’ve created the logic for our program in three separate cells, we can now put it all together in one. Let’s remove all the testing print statements. You can essentially take all the code from the three cells and paste them into one cell, resulting in the following


    # step 1: ask user for calculation to be performed
    operation = input("Would you like to add/subtract/multiply/divide? ").lower()
    # step 2: ask for numbers, alert order matters for subtracting and dividing
    if operation == "subtract" or operation == "divide":
        print( "You chose {}.".format(operation) )
        print("Please keep in mind that the order of your numbers matters.")
    num1 = input("What is the first number? ")
    num2 = input("What is the second number? ")
    # step 3: setup try/except for mathematical operation
    try:
        # step 3a: immediately try to convert numbers input to floats
        num1, num2 = float(num1), float(num2)
        # step 3b: perform operation and print result
        if operation == "add":
            result = num1 + num2
            print( "{} + {} = {}".format(num1, num2, result) )
        elif operation == "subtract":
            result = num1 - num2
            print( "{} - {} = {}".format(num1, num2, result) )
        elif operation == "multiply":
            result = num1 * num2
            print( "{} * {} = {}".format(num1, num2, result) )
        elif operation == "divide":
            result = num1 / num2
            print( "{} / {} = {}".format(num1, num2, result) )
        else:
            # else will be hit if they didn't choose an option correctly
            print("Sorry, but '{}' is not an option.".format(operation))
    except:
        # step 3c: print error
        print("Error: Improper numbers used. Please try again.")


Go ahead and run that cell. Now you’re able to run a single cell to get our program to work from start to finish. It’s not perfect, but it gives you the ability to perform simple calculations. As always, try to break the program, change a line around, and make it your own
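If you’d like to experiment without typing answers each time, one possible variation is to wrap the same logic in a function. This is only a sketch: functions haven’t been covered yet, the name calculate is hypothetical, and the separate divide-by-zero message is an addition that the project itself does not include.

```python
def calculate(operation, num1, num2):
    """Return the text the calculator would print for one calculation.

    calculate is a hypothetical helper used only for this sketch; the
    project keeps the same logic directly inside a notebook cell.
    """
    try:
        num1, num2 = float(num1), float(num2)
    except ValueError:
        return "Error: Improper numbers used. Please try again."

    operation = operation.lower()
    if operation == "add":
        return "{} + {} = {}".format(num1, num2, num1 + num2)
    elif operation == "subtract":
        return "{} - {} = {}".format(num1, num2, num1 - num2)
    elif operation == "multiply":
        return "{} * {} = {}".format(num1, num2, num1 * num2)
    elif operation == "divide":
        if num2 == 0:
            # Added for this sketch: report division by zero separately.
            return "Error: Cannot divide by zero."
        return "{} / {} = {}".format(num1, num2, num1 / num2)
    else:
        return "Sorry, but '{}' is not an option.".format(operation)


# The arguments are strings on purpose, mimicking what input() returns.
print(calculate("add", "2", "3"))
print(calculate("divide", "1", "0"))
print(calculate("add", "two", "3"))
```

Because the values are passed in rather than typed, it’s easy to try many combinations quickly, including the ones meant to break the program.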

Congratulations on finishing another project. As simple as this calculator may be, we have shown the ability to use logic, take user input and convert it, and check for errors

Weekly Summary

What a week. We’ve just seen how we can interact with our user and perform branching statements. This will allow us to build projects with logic, which will run specific code based on the information that the program is using. The biggest concepts to remember here are our conditional statements and try/except blocks. It’s important to know the difference between catching an error and an error causing your program to crash. We always want to catch errors when possible to shore up our program. Next week we’ll learn about loops and how we can continuously run blocks of code over and over until we no longer want to.

Challenge Question Solution

If you were to run the code block for the challenge question, you would find that it produces an error. This is because we try to access our “element” variable before it’s declared. If you were to reverse these two lines, the program would work as desired.

Weekly Challenges

To test out your skills, try these challenges.

1. Reversing Numbers: Alter the calculator project so that the order of the numbers doesn’t matter. There are a few ways to get the same result; one way is to ask the user if they’d like to reverse the placement of the numbers.
2. Age Group: Ask the user to input their age. Depending on their input, output one of the following groups:
   a. Between 0 and 12 = “Kid”
   b. Between 13 and 19 = “Teenager”
   c. Between 20 and 30 = “Young Adult”
   d. Between 31 and 64 = “Adult”
   e. 65 or above = “Senior”
3. Text-Based RPG: This is an open-ended exercise. Create a text-based RPG with a story line. You take user input and give them a couple choices, and depending on what they choose, they can go down a different path. You’ll use several branching statements depending on the length of the story.

CHAPTER 4

Lists and Loops

Throughout this week, I’ll be introducing a new data type called “lists” and a new concept called “loops.” Lists will give us the ability to store large sets of data, while loops will allow us to rerun sections of our code. These two topics are being introduced together because lists work well with loops. Even though lists are one of the most important data types in Python, we needed to understand the basics of data types and branching statements before introducing them. By the end of the week, we’ll have the tools necessary to build a small-scale hangman game. We’ll use all the concepts that we’ve learned from previous weeks and this week. Through application and repetition, you’ll be able to understand each concept further each time it’s introduced. If you don’t get a concept just yet, it’s important to keep pushing through and try not to get stuck on a single lesson.

Overview

• Understanding list data types
• How and why to use for loops
• How and why to use while loops
• Understanding how to work with lists
• Creating Hangman together

CHALLENGE QUESTION

Imagine that you’re the mayor of a major city. For this example, let’s assume that the major city is Boston, MA. You’ve just been alerted that you need to evacuate the city. What do you do first?


Monday: Lists

Today we’ll be introducing one of the most important data types in Python, the list. In other languages, they are also known as “arrays” and have similar characteristics. This is the first data collection that you’ll learn. We’ll see other data collection types in later weeks. To follow along with the content for today, let’s open up Jupyter Notebook from our “python_bootcamp” folder. Once it’s open, create a new file, and rename it to “Week_04.” Next, make the first cell markdown that has a header saying: “Lists.” We’ll begin working underneath that cell.

What Are Lists?

A list is a data structure in Python that is a mutable, ordered sequence of elements. Mutable means that you can change the items inside, while ordered sequence is in reference to index location. The first element in a list will always be located at index 0. Each element or value that is inside of a list is called an item. Just as strings are defined as characters between quotes, lists are defined by having different data types between square brackets [ ]. Also, like strings, each item within a list is assigned an index, or location, for where that item is saved in memory. Lists are also known as a data collection. Data collections are simply data types that can store multiple items. We’ll see other data collections, like dictionaries and tuples, in later chapters.
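Here is a short sketch of what “mutable” and “ordered” look like in practice. The list colors and the string word are made-up examples, and the comparison with strings is included only to show the contrast.

```python
# A list keeps its items in the order they were written.
colors = ["red", "green", "blue"]
first = colors[0]            # "red"

# Lists are mutable: an item can be replaced in place.
colors[0] = "yellow"

# Strings are also indexed, but they are immutable, so the same kind
# of assignment raises a TypeError.
word = "red"
string_changed = True
try:
    word[0] = "b"
except TypeError:
    string_changed = False

print(first)
print(colors)
print(string_changed)
```

The list ends up with “yellow” in the first position, while the string is left untouched.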

Declaring a List of Numbers

For our first list, we’re going to create a list filled with only numbers. Defining a list is like any other data type; on the left of the assignment operator is the name of the variable, and on the right is the value. The difference here is that the value is a set of items declared between square brackets. This is useful for storing similar information, as you can easily pass around one variable name that stores several elements. To separate each item within a list, we simply use commas. Let’s try.

    # declaring a list of numbers
    nums = [5, 10, 15.2, 20]
    print(nums)


Go ahead and run that cell. You’ll get an output of [5, 10, 15.2, 20]. When a list is output, it includes the brackets with it. This current list is made up of three integers and one float.

Accessing Elements Within a List

Now that we know how to define a list, we need to take the next step and understand how to access items within them. In order to access a specific element within a list, you use an index. When we declare our list variable, each item is given an index. Remember that indexing in Python starts at zero and is used with brackets. Wednesday of Week 2 also covers indexing.

    # accessing elements within a list
    print( nums[1] )   # will output the value at index 1 = 10
    num = nums[2]      # saves index value 2 into num
    print(num)         # prints value assigned to num

Go ahead and run that cell. We’ll get two values output here, 10 and 15.2. The first value is output because we’re accessing the index location of 1 in our nums list, which has an integer of 10 stored there. The second value was printed out after we created a new variable called num, which was set to the value stored at index 2 within our nums list.
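Because indexing starts at zero, the last valid index is always one less than the number of items. The following sketch uses the built-in len() function, which the lesson hasn’t introduced yet, to show where the valid range ends.

```python
nums = [5, 10, 15.2, 20]

# Four items means the valid indexes are 0, 1, 2, and 3.
last_index = len(nums) - 1
last_item = nums[last_index]

# Asking for an index past the end raises an IndexError.
out_of_range = False
try:
    nums[4]
except IndexError:
    out_of_range = True

print(last_index, last_item, out_of_range)
```

If you ever see an IndexError, checking the index against the length of the list is a good first step.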

Declaring a List of Mixed Data Types

Lists can hold any data type, even other lists. Let’s check out an example of several data types.

    # declaring a list of mixed data types
    num = 4.3
    data = [num, "word", True]   # the power of data collection
    print(data)


Go ahead and run that cell. This will output [4.3, 'word', True]. It outputs 4.3 as the first item because when the list is defined, it stores the value of num, not the variable itself.

Lists Within Lists

Let’s get a little more complex and see how lists can be stored within another list.

    # understanding lists within lists
    data = [5, "book", [34, "hello"], True]   # lists can hold any type
    print(data)
    print( data[2] )

Go ahead and run that cell. This will output [5, 'book', [34, 'hello'], True] and [34, 'hello']. The first output is the entire data variable’s value, which stores an integer, a string, a list, and a boolean data type. The second output is the list stored inside of our data variable, which is located at index 2 and includes an integer and string data type.

Accessing Lists Within Lists

In the last cell, we saw how to output the list stored within the data variable. Now, we’ll see how we can access the items within the inner list. To access items within a list normally, we simply use bracket notation and the index location. When that item is another list, you simply add a second set of brackets after the first set. Let’s check out an example and come back to it.

    # using double bracket notation to access lists within lists
    print( data[2][0] )       # will output 34
    inner_list = data[2]      # inner list will equal [34, 'hello']
    print( inner_list[1] )    # will output 'hello'

Go ahead and run that cell. The first output will be 34. This is because our first index location is accessing index 2 in data, which is a list. Then the second index location specified is accessing the value in that list at location zero, which results in the integer of 34. The second output is “hello”. We get this result because we declared a variable to store the value at index 2 of our data variable, which happens to be a list. Our inner_list variable is now equal to [34, 'hello'] and we access the value at index 1, which is the string “hello”. To get a little bit more understanding of how multi-indexing works, check out Table 4-1.

Table 4-1. Multi-indexing values

Index Location   Value at Location   Data Type   Can Be Indexed Again
0                5                   Integer     No
1                'book'              String      Yes
2                [34, 'hello']       List        Yes
3                True                Boolean     No

Notice that strings can also be indexed further. If you wanted to only print out the “b” in “book,” you would simply write the following.

    >>> print( data[1][0] )   # will output 'b'
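Since both lists and strings can be indexed again, the brackets can be chained more than twice. A small sketch using the same data list, with variable names picked just for this example:

```python
data = [5, "book", [34, "hello"], True]

inner = data[2]           # the inner list
word = data[2][1]         # the string inside the inner list
letter = data[2][1][0]    # the first character of that string

print(inner)
print(word)
print(letter)
```

Each set of brackets works on whatever the previous set returned, reading from left to right.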

Changing Values in a List

When you work with lists you need to be able to alter the value of the items within the list. It’s like re-declaring a normal variable to a different value, except you access the index first.

    # changing values in a list through index
    data = [5, 10, 15, 20]
    print(data)
    data[0] = 100   # change the value at index 0 - (5 to 100)
    print(data)

Go ahead and run that cell. Before we altered the value at index 0, it outputs [5, 10, 15, 20]. Once we accessed the zero index and changed its value to 100, however, the list ended up changing to [100, 10, 15, 20].


Variable Storage

When variables are declared, the value assigned is put into a location in memory. These locations have a specific reference ID. It’s not often you’ll need to check the ID of a variable, but for educational purposes, it’s good to know how storage works. We would use the id() function to check the storage location in memory for a variable.

    >>> a = [5, 10]
    >>> print( id(a) )   # large number represents location in memory

When a list is stored in memory, each item is given its own location. Changing the value using index notation will change the value stored within that memory block. Now, if a variable’s value is another variable, like so:

    >>> a = [5, 10]
    >>> b = a

then both names refer to the same list, and changing the value at a specific index will change the value seen through both variables. Let’s see an example.

    # understanding how lists are stored
    a = [5, 10]
    b = a
    print( "a: {}\t b: {}".format(a, b) )
    print( "Location a[0]: {}\t Location b[0]: {}".format( id(a[0]), id(b[0]) ) )
    a[0] = 20   # re-declaring the value of a[0] also changes b[0]
    print( "a: {}\t b: {}".format(a, b) )

Go ahead and run that cell. We’re going to get several outputs here. The first is printing out the values of both list variables to show that they have the same values. The second print statement will output the location in memory for each list’s first item. Then lastly, after we change the value of the first item within our “a” list, the value in our “b” list also changes. This is because both variables point to the same list in memory.
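Another way to check whether two variables share the same list is the is keyword, which the lesson doesn’t cover but which compares identity rather than values. A minimal sketch, where the list c is added purely for contrast:

```python
a = [5, 10]
b = a            # b is a second name for the same list
c = [5, 10]      # c is a separate list that holds equal values

same_object = a is b             # True
same_id = id(a) == id(b)         # True
equal_values = a == c            # True
separate_object = a is not c     # True

a[0] = 20        # visible through b, but not through c

print(same_object, same_id, equal_values, separate_object)
print(b)
print(c)
```

Two lists can be equal without being the same list, and only a shared list reflects changes made through either name.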


Copying a List

So how do you create a similar list without altering the original? You copy it. Let’s see how.

    # using [:] to copy a list
    data = [5, 10, 15, 20]
    data_copy = data[:]   # a single colon copies the list
    data[0] = 50
    print( "data: {}\t data_copy: {}".format(data, data_copy) )

Go ahead and run that cell. The output this time will result in only our data variable having the first item set to 50. As data_copy was merely a copy of the list, we’re able to always keep the original values intact if we need to use them again.

Note: You can also use the method .copy().
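One caveat worth knowing, sketched below with a made-up nested list: both [:] and .copy() make what is called a shallow copy. The outer list is new, but any list stored inside it is still shared with the original.

```python
data = [5, 10, [15, 20]]

by_slice = data[:]
by_method = data.copy()

# Replacing a top-level item only affects the original list.
data[0] = 50

# Changing an item inside the nested list shows up in the copies too,
# because the copies still point at that same inner list.
data[2][0] = 99

print(data)
print(by_slice)
print(by_method)
```

For the flat lists of numbers and strings used in this lesson, either way of copying behaves exactly as described.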

MONDAY EXERCISES

1. Sports: Define a list of strings, where each string is a sport. Then output each sport with the following line: “I like to play {}”…
2. First Character: For the following list, print out each item’s first letter. (Output should be ‘J’, ‘A’, ‘S’, ‘K’.)

    names = ['John', 'Abraham', 'Sam', 'Kelly']

Today was all about our first data collection type, the list. There was a lot to cover, but it’s important to understand how to define, change values, and make copies of lists


Tuesday: For Loops

Today will be spent covering a crucial concept in programming: loops. In most applications, you’re going to need the ability to run the same code more than once. Rather than writing the same lines of code several times, we use loops. In Python there are two types of loops; today’s lesson will be on “For Loops.” To follow along with this lesson, let’s continue from our previous notebook file “Week_04” and simply add a markdown cell at the bottom that says “For Loops.”

How Loops Work

Loops are how programmers rerun the same lines of code several times. Loops will always run until a condition is met. Take a first-person shooter: the game will continue to run until either you’ve won, or your health reaches zero. Once either of those conditions occurs, the game ends.

Note: It’s good practice to avoid writing the same lines of code over and over; condensing repeated code makes a program easier to read and maintain.

Whether you know it or not, loops are everywhere in life. Every day we wake up, go to work, and go to bed. We know it as a routine, but it’s simply a loop. We repeat the same process each day until we reach the weekend. The same concept is applied to the loops in our programs.

Writing a For Loop

For loops are primarily used to loop a set number of times. Take Figure 4-1, for instance; this syntax suggests that the loop will run five times. Let’s break this down further. Every for loop begins with the keyword “for”. Then you define a temporary variable, sometimes known as a counter or index. Next is the “in” keyword, followed by the range function (which will be explained later). Lastly, we have a colon to end the statement. The for loops in this section will follow this structure of keyword, variable, keyword, function, and colon.


Figure 4-1. For loop syntax

Now that we’ve talked about the structure of writing a for loop, let’s write one.

    # writing your first for loop using range
    for num in range(5):
        print( "Value: {}".format(num) )

Go ahead and run that cell. This will output “0, 1, 2, 3, 4” for our values. This loop is essentially counting from zero up to, but not including, five and printing out each number. So how does it print out each number? When the for loop is created, the range function begins at zero by default and assigns the value of zero into our temporary variable num. Each time through the loop is what we call an iteration. For each iteration, once all the code within the block runs, the current iteration is finished, and the loop starts over again at the top. Except this time, it increments the value of num by the step, which by default is 1. Our temporary variable is assigned the value of 1 and continues to run the lines of code inside the for loop, which is simply printing out the value of num. It will continue to do this until num would reach the number 5, at which point the loop ends. To give you an idea of the values assigned for each iteration, reference Table 4-2.

Table 4-2. Values assigned for each iteration using range()

Loop Iteration   Value of Num   Output
1                0              Value: 0
2                1              Value: 1
3                2              Value: 2
4                3              Value: 3
5                4              Value: 4


Note: The value 5 is not output because range() counts up to, but not including, the stop value.

Range()

Range allows us to count from one number to another while defining where to start, where to stop, and how much to increment or decrement by. This means that we could count every other number or every fifth number if we wanted to. When used with a for loop, it gives us the ability to loop a certain number of times. In the previous example, we saw that a range of 5 printed out five numbers. This is because range defaults to starting at 0 and increments by 1 each time. Let’s see another example.

    # providing the start, stop, and step for the range function
    for num in range(2, 10, 2):
        print("Value: {}".format(num))    # will print all evens from 2 up to, but not including, 10

Go ahead and run that cell. This time we’ve told our program to start the loop at the value of 2 and count up to 10, incrementing by 2. The output for our values becomes “2, 4, 6, 8”. The value 10 is not printed because the stop value is excluded.
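To make the three arguments more concrete, the short sketch below collects the values from a few range() calls into lists so they can be compared side by side. The variable names (count_up, evens, count_down) are made up for this illustration, and list() is used here only to show all of the values at once.

```python
# range(stop): start defaults to 0 and step defaults to 1
count_up = list(range(5))

# range(start, stop, step): the stop value itself is never included
evens = list(range(2, 10, 2))

# a negative step counts downward, here from 10 toward 0 in steps of 3
count_down = list(range(10, 0, -3))

print(count_up)
print(evens)
print(count_down)
```

Running the cell should show that every sequence stops before reaching its stop value, whether it counts up or down.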

Looping by Element

When working with data types that are iterable, meaning they have a collection of elements that can be looped over, we can write the for loop differently.

    # printing all characters in a name using the 'in' keyword
    name = "John Smith"
    for letter in name:
        print("Value: {}".format(letter))


Go ahead and run that cell. The output will be each letter printed out one at a time. Remember that strings can be indexed and are a collection of characters or symbols, which makes them iterable. This for loop will iterate over each character and run the code within the block with that character/symbol. Table 4-3 goes over the first few iterations of this loop

Table 4-3. Iteration values for looping over a string

    Loop Iteration    Value of letter    Output
    1                 J                  Value: J
    2                 o                  Value: o
    3                 h                  Value: h
    4                 n                  Value: n
    5                 space symbol       Value:
    6                 S                  Value: S

Continue Statement

Now that we’ve seen how a loop works, let’s talk about a few important statements that we can use with loops. The first is the continue statement. Once a continue statement is hit, the current iteration stops and the loop goes back to the top. Let’s see an example.

    # using the continue statement within a for loop
    for num in range(5):
        if num == 3:
            continue
        print(num)

Go ahead and run that cell. The output will result in “0, 1, 2, 4” because the continue statement is only reached when num is equal to the value of 3. Once the statement is hit, it stops the current iteration and goes back to the top to continue looping on the next iteration. The code below continue is skipped for that iteration, so it doesn’t hit the print statement.


Break Statement

One of the most important statements we can use is the break statement. It allows us to break out of a loop at any point in time. Let’s see an example.

    # breaking out of a loop using the 'break' keyword
    for num in range(5):
        if num == 3:
            break
        print(num)

Go ahead and run that cell. The output will result in “0, 1, 2” because we broke out of the loop completely when num was equal to 3. Once a break is read, the loop completely stops, and no more code within the loop is run. This is useful for stopping a loop when a condition is met.

Note If you use a double loop, the break statement will only break out of the loop that the statement is within. Meaning, it will not break out of both loops if the break statement is used within the inner loop
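As a rough illustration of the note above, the following sketch records which pairs of values are reached when break sits inside the inner loop. The list name visited is hypothetical, and append (covered on Thursday) is used only to keep a record of what ran.

```python
# keep a record of every (i, j) pair that is processed
visited = []

for i in range(3):          # outer loop
    for j in range(3):      # inner loop
        if j == 1:
            break           # leaves only the inner loop
        visited.append((i, j))

# the outer loop still ran three times, once for each value of i
print(visited)
```

Because the outer loop keeps going after each break, the record contains one pair for every value of i.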

Pass Statement

The last of these three statements is pass. The pass statement is simply a placeholder so that the program doesn’t break. Let’s see an example.

    # setting a placeholder using the 'pass' keyword
    for i in range(5):
        # TODO: add code to print number
        pass

Go ahead and run that cell. Nothing happens, but that’s a good thing. If you take the pass statement out completely, Python will report a syntax error because there needs to be at least one statement within the block, and a comment doesn’t count. The pass statement is simply there so that we don’t have to write code within the loop just yet. It’s useful for framing out a program.

Note Using “TODO” is general practice for setting a reminder

TUESDAY EXERCISES

1. Divisible by Three. Write a for loop that prints out all numbers from 1 to 100 that are divisible by three.

2. Only Vowels. Ask for user input, and write a for loop that will output all the vowels within it. For example:

    >>> "Hello" ➔ "eo"

Today was spent learning all about for loops and how they work. Looping allows us to run the same lines of code several times

Wednesday: While Loops

We’ll be going over the other type of loop today, the while loop. Yesterday we saw how loops work and why we would use a for loop. A while loop is generally used when you need to loop based on a condition rather than counting. Today will be all about condition-based looping. To follow along with this lesson, let’s continue from our previous notebook file “Week_04” and simply add a markdown cell at the bottom that says, “While Loops.”


Writing a While Loop

Like a for loop, the while loop starts out with a keyword, in this case “while”. Following that, we have a conditional like the one we would use to write an if statement. Let’s see an example.

    # writing your first while loop
    health = 10
    while health > 0:
        print(health)
        health -= 1    # forgetting this line will result in an infinite loop

Go ahead and run that cell. This will continue to print out the value of health for as long as the condition is true. In this case, once health is no longer greater than zero, the loop stops running. On the last line, we decrement health by one, so each iteration brings health closer to zero. If we didn’t decrement health at any point in time, this would become an infinite loop (which is bad).

While vs. For

I’ve explained a few times now why we would use each loop; however, it’s always good to reiterate concepts. For loops are generally used when you need to count or iterate over a collection of elements. While loops are generally used when doing condition-based looping. When using a while loop, you’ll often use boolean variables. Each loop has its use cases, and sometimes either one will do, but the general rule of thumb is counting with for loops and conditions with while loops.
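To compare the two styles, here is a small sketch that adds up the numbers 0 through 2 with each kind of loop. The variable names for_total and while_total are invented for this example.

```python
# counting with a for loop: range handles the counter for us
for_total = 0
for num in range(3):
    for_total += num

# the same work with a while loop: we manage the counter ourselves
while_total = 0
num = 0
while num < 3:
    while_total += num
    num += 1    # without this line the condition would never become False

print(for_total, while_total)
```

Both loops reach the same total; the for loop is shorter here because the task is counting rather than waiting on a condition.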

Note The pass, break, and continue statements all work the same way for while loops as well

Infinite Loops

In a previous cell, I mentioned that infinite loops were bad. An infinite loop will continue to run until the program breaks, the computer is shut down, or until time stops. Knowing this, stay away from creating infinite loops. Here is an example of an infinite loop.

    >>> game_over = False
    >>> while not game_over:
    >>>     print(game_over)

If you were to run this within a cell, eventually you would have to interrupt or restart the kernel, or shut down Jupyter Notebook and restart it. This is because the loop runs until game_over becomes True, and nothing inside the loop ever changes it. Always make sure you have a way to exit your loops, whether it be by a break or by a condition.
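One possible way to guarantee an exit is sketched below. The max_turns limit and the turns counter are assumptions added for illustration; they are not part of the example above.

```python
max_turns = 5        # hypothetical safety limit
turns = 0
game_over = False

while not game_over:
    turns += 1
    if turns >= max_turns:
        game_over = True    # the condition is False on the next check, so the loop ends

print(turns, game_over)
```

The same exit could be written with a break statement inside the if block; either way, something inside the loop must eventually change the condition.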

Nested Loops

The concept of a loop within a loop is what we call a nested loop. The concepts of a loop still apply. When using nested loops, the inner loop must always finish running before the outer loop can continue. Let’s see an example.

    # using two or more loops together is called a nested loop
    for i in range(2):    # outside loop
        for j in range(3):    # inside loop
            print(i, j)

Go ahead and run that cell. At first, this may seem a bit confusing, since there’s a lot going on here. Let’s break the output down with Table 4-4.

Table 4-4. Tracking nested loop values

    Iteration    Value of i    Value of j    Inner Loop Count    Outer Loop Count
    1            0             0             1                   1
    2            0             1             2                   1
    3            0             2             3                   1
    4            1             0             4                   2
    5            1             1             5                   2
    6            1             2             6                   2


In total we can see that the inner loop runs six times and the outer loop runs twice. The value of i only increments when the outer loop runs, which doesn’t occur until the inner loop finishes. The inner loop must count from 0 up to, but not including, 3 each time before the next iteration of the outer loop can run.
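The counts described above can be checked with a small sketch. The counter names inner_count and outer_count are made up for this illustration.

```python
inner_count = 0
outer_count = 0

for i in range(2):          # outer loop
    outer_count += 1
    for j in range(3):      # inner loop runs fully for each value of i
        inner_count += 1

print("inner:", inner_count, "outer:", outer_count)
```

The inner count is the product of the two ranges, which is why nested loops can become slow when both ranges are large.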

WEDNESDAY EXERCISES

1. User Input. Write a while loop that continues to ask for user input and runs until they type “quit”.

2. Double Loop. Write a for loop within a while loop that will count from 0 to 5, but when it reaches 3, it sets a game_over variable to True and breaks out of the loop. The while loop should continue to loop until game_over is True. The output should only be 0, 1, 2.

Today was a bit of a shorter day, as the concept of loops is the same whether it’s a while or for. Remember that a while loop is used for conditional looping, while we use a for loop for counting/iterating

Thursday: Working with Lists

Now that we’ve learned what lists are and how to use loops, we’re going to go over how to work with lists today. Lists are an important part of any program in Python, so we need to understand our capabilities when using them. To follow along with this lesson, let’s continue from our previous notebook file “Week_04” and simply add a markdown cell at the bottom that says, “Working with Lists.”


Checking Length

Often, we’ll need to know how many items are within a list. To do so, we use the len() function.

    # checking the number of items within a list
    nums = [5, 10, 15]
    length = len(nums)    # len() returns an integer
    print(length)

Go ahead and run that cell. This will output 3. We use the len function for several purposes, whether it’s checking for an empty list or using it within the range function to loop over a list.

Slicing Lists

A few weeks back we talked about slicing a string. Lists work the same way, so you’re able to access specific items. Slicing takes the same arguments as the range function: start, stop, and step.

    # accessing specific items of a list with slices
    print(nums[1:3])    # will output items in index 1 and 2
    print(nums[:2])     # will output items in index 0 and 1
    print(nums[::2])    # will print every other index - 0, 2, 4, etc.
    print(nums[-2:])    # will output the last two items in list

Go ahead and run that cell. The outputs are described in the comments next to each statement. We use bracket notation as if we’re accessing an index; however, we separate the values with colons. The order is always [start:stop:step]. By default, start is zero, stop is the end of the list, and step is one. You have the option to leave those values out if you’d like to keep the defaults. Using a negative number for the step position will result in slicing backward. If you use a negative number in the start or stop positions, then the position is counted from the back of the list. This means that if you state -5 as the stop position, it will slice from the start of the list all the way to five elements before the list ends.
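The negative values described above are easier to see on a slightly longer list. The following sketch uses a hypothetical list called letters; the other variable names are also invented for illustration.

```python
letters = ["a", "b", "c", "d", "e", "f"]

first_three = letters[:3]          # start defaults to 0
all_but_last_two = letters[:-2]    # a negative stop counts from the back
every_other = letters[::2]         # step of 2 skips every second item
middle = letters[1:5:2]            # start, stop, and step together
reversed_copy = letters[::-1]      # a negative step slices backward

print(first_three, all_but_last_two, every_other, middle, reversed_copy)
```

Each slice returns a new list, so the original letters list is left unchanged.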


Adding Items When you need to add items to your lists, Python has two different methods for doing so

.append()

Append will always add the value within the parentheses to the back of the list. Let’s see.

    # adding an item to the back of a list using append
    nums = [10, 20]
    nums.append(5)
    print(nums)    # outputs [10, 20, 5]

Go ahead and run that cell. We declared a list with two items in it to start and then added the integer value of 5 to the back of the list.

.insert()

The second method to add items to a list is insert. This method requires an index so that it can insert a value at a specific location. Let’s see an example.

    # adding a value to the beginning of the list
    words = ["ball", "base"]
    words.insert(0, "glove")    # first number is the index, second is the value
    print(words)

Go ahead and run that cell. The output will result in ['glove', 'ball', 'base']. Glove is in the zero index now because we specified that index within our insert method.

Removing Items There are several ways to remove items from a list, the following are the main two methods


.pop()

By default, the pop method removes the last item in the list; however, you can specify an index to remove as well. When pop is used, it not only removes the item but also returns it. This allows us to save that value into a variable to be used later.

    # using pop to remove items and saving to a variable to use later
    items = [5, "ball", True]
    items.pop()    # by default removes the last item
    removed_item = items.pop(0)    # removes 5 and saves it into the variable
    print(removed_item, "\n", items)

Go ahead and run that cell. Using pop, we can see that it removed the True item first, then the element in index zero, which happens to be the integer 5. While popping it out of the list, we saved it into a variable, which we later output along with the new list.

.remove()

The remove method allows us to remove items from a list based on their given value.

    # using the remove method with a try and except
    sports = ["baseball", "soccer", "football", "hockey"]
    try:
        sports.remove("soccer")
    except ValueError:
        print("That item does not exist in the list")
    print(sports)

Go ahead and run that cell. Here we’ll see that the output is our sports list without soccer because we were able to remove it correctly. The reason we use a try and except with the removal is that if “soccer” didn’t exist in the list, remove would raise a ValueError and the program would crash.
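Two details of remove are worth seeing in action: it removes only the first matching value, and it raises a ValueError when the value is missing. The sketch below uses made-up names (names, was_removed) to show both.

```python
names = ["Bob", "Jack", "Bob"]
names.remove("Bob")    # removes only the first "Bob"

# attempting to remove a value that is not in the list
try:
    names.remove("Mary")
    was_removed = True
except ValueError:
    was_removed = False    # remove raised an error, so nothing changed

print(names, was_removed)
```

An alternative to the try and except is to check with the “in” keyword before calling remove.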


Working with Numerical List Data

Python provides a few functions for us to use on lists of numerical data, such as min, max, and sum. There are several more that we can use, though these are used most frequently.

    # using min, max, and sum
    nums = [5, 3, 9]
    print(min(nums))    # will find the lowest number in the list
    print(max(nums))    # will find the highest number in the list
    print(sum(nums))    # will add all numbers in the list and return the sum

Go ahead and run that cell. The output will result in 3, 9, and 17. As their names state, min and max will find the minimum and maximum number. The sum function will simply add all the numbers up.

Sorting a List Often, you’ll need to work with a sorted list. There are a couple methods for doing so, but they are very different. One will change the original list, while the other returns a copy

sorted()

The sorted function will work on either numerical or alphabetical lists, but not one that is mixed. Sorted also returns a sorted copy of the list, so it doesn’t alter the original. If you need to keep the original intact, be sure to use this function.

    # using sorted on lists for numerical and alphabetical data
    nums = [5, 8, 0, 2]
    sorted_nums = sorted(nums)    # save to a new variable to use later
    print(nums, sorted_nums)    # the original list is intact

Go ahead and run that cell. You’ll notice the output of our nums list is still in the original order from when we declared it. To use the new sorted list, we simply save it to a new variable.


.sort()

The sort method is used for the same purpose as our previous sorted function; however, it will change the original list directly.

    # sorting a list with .sort() in-place
    nums = [5, 0, 8, 3]
    nums.sort()    # alters the original variable directly
    print(nums)

Go ahead and run that cell. The resulting output will be a properly sorted list. Just remember that the nums variable is now changed, as .sort() changes the value directly.
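The difference between the two approaches can be sketched side by side. The names descending and result are invented for this example; reverse=True is an optional argument that both sorted and .sort() accept.

```python
nums = [5, 0, 8, 3]

# sorted returns a new list and leaves nums alone
descending = sorted(nums, reverse=True)
print(nums, descending)

# .sort() rearranges nums itself and returns None
result = nums.sort()
print(nums, result)
```

Because .sort() returns None, writing nums = nums.sort() is a common mistake that replaces the list with None.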

Conditionals and Lists When working with lists, often you’ll need to check if values exist. Now we’ll introduce how to run conditional statements on a list. There are many reasons to run a conditional on a list; these are simply a couple examples

Using “in” and “not in” Keywords

We’ve seen the use of these keywords already, when we covered conditional statements last week. When working with lists, they let us quickly check whether a value exists within the list.

    # using conditional statements on a list
    names = ["Jack", "Robert", "Mary"]
    if "Mary" in names:
        print("found")    # will run since Mary is in the list
    if "Jimmy" not in names:
        print("not found")    # will run since Jimmy is not in the list

Go ahead and run that cell. The output results in “found” and “not found”. On the first statement, we were trying to see if “Mary” existed in the list, which it does. The second conditional statement checked to see if “Jimmy” did not exist, which is also true, so it too runs.


Checking an Empty List

There are many reasons to check for an empty list. It’s usually to ensure you don’t cause any errors in your program, so let’s see how we can check.

    # using conditionals to see if a list is empty
    nums = []
    if not nums:    # could also say 'if nums == []'
        print("empty")

Go ahead and run that cell. This will output “empty”. As mentioned in the comment, we could have also checked to see if it were equal to empty brackets. Here, I wanted to show you how to use the “not” keyword, which works because an empty list is treated as False in a condition. To check for a list with items, you would write the following.

    >>> if nums:

Loops and Lists You can use both the for and while loops to iterate over the items within a list

Using For Loops

When iterating over a list with a for loop, the syntax looks like when we used the range function previously; however, this time we use a temporary variable, the in keyword, and the name of the list. For each iteration, the temporary variable is assigned the item’s value. Let’s try it out.

    # using a for loop to print all items in a list
    sports = ["Baseball", "Hockey", "Football", "Basketball"]
    for sport in sports:
        print(sport)

Go ahead and run that cell. Here we can see that this cell will output each item within the list. During the first iteration, the temporary variable “sport” is assigned “Baseball,” and once it prints it out, it moves on to the next item.
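When the position of each item matters, one option is to combine range with len, as hinted at in the Checking Length section. The sketch below is one possible way to do that; the list name labeled is hypothetical.

```python
sports = ["Baseball", "Hockey", "Football", "Basketball"]
labeled = []

# range(len(sports)) produces the indexes 0 through 3
for i in range(len(sports)):
    labeled.append("{}: {}".format(i, sports[i]))

for line in labeled:
    print(line)
```

Looping directly over the list is usually simpler; looping by index is mainly useful when the index itself is needed.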


Using While Loops

While loops are always used for conditional looping. One great use case for a while loop with lists is removing an item. There are many uses; this is just one of them.

    # using the while loop to remove a certain value
    names = ["Bob", "Jack", "Rob", "Bob", "Robert"]
    while "Bob" in names:
        names.remove("Bob")    # removes all instances of 'Bob'
    print(names)

Go ahead and run that cell. The output will be our names list without “Bob” in the list. We used the combination of the while loop with a conditional to check for our “Bob” value in the list and then continued to remove it until our condition was no longer true.

THURSDAY EXERCISES

1. Remove Duplicates. Remove all duplicates from the list below. Hint: Use the .count() method. The output should be ['Bob', 'Kenny', 'Amanda'].

    >>> names = ['Bob', 'Kenny', 'Amanda', 'Bob', 'Kenny']

2. User Input. Use a while loop to continually ask the user to input a word, until they type “quit”. Once they type a word in, add it to a list. Once they quit the loop, use a for loop to output all items within the list.

Today was important so that we could understand how to work with lists, whether it be a conditional statement or a loop. There are many methods out there that lists can use; we’ll go over more of them throughout the rest of this book

Friday: Creating Hangman

As the weeks go on, the projects will generally get longer. Today we’re going to be building Hangman with the use of all the concepts learned from the past four weeks. As usual, new concepts will be introduced as we code along. Today’s goal is to have a fully functioning Hangman game, where we can guess, lose a life, and win or lose the game. We won’t be adding graphics, although after we complete the project together, feel free to add them yourself. To follow along with this lesson, let’s continue from our previous notebook file “Week_04” and add a markdown cell at the bottom that says, “Friday Project: Creating Hangman.”

Final Design

As always, we want to lay out our final design before we begin coding. This week will not be based around graphics, like last week, so we’ll focus on the logic and the steps necessary to run the program. Luckily for us, the logic is essentially the steps needed to play the game.

1. Select a word to play with.
2. Ask user for input.
3. Check if guess is correct.
    a. If it is, show the letter in the proper place.
    b. If it isn’t, lose a life.
4. Continue steps 2 and 3 until one of the following occurs:
    a. The user guesses the word correctly.
    b. The user loses all their lives.

This is the main game play functionality. There are several other steps we need to perform before actually running the game, like declaring game variables; however, this is the primary functionality that we needed to lay out before we begin coding. Knowing this structure will allow us to stay on track with our program

Previous Line Symbols Introduced

Like how we added line numbers back in Week 1, we’re going to introduce the concept of line symbols for this project and all others going forward. They are needed because we will often edit previously written lines or add code in the middle of the project. These symbols will be shown by the use of three empty squares and will represent previously written code. You can see an example here.

    1| if num > 1:    ◻◻◻
    3|     # new code will go here
    5|     print(    ◻◻◻

When we add lines in between previously written code, I will use these three squares to signify which line should be above and below the code we’re writing. It also means that you should leave the line unaltered. When we need to overwrite a previous line, I will let you know. Be sure to pay attention to line numbers when you see those three squares, as that will help to let you know if you missed a line or not.

Note Turn lines on by pressing “L” after clicking the cell’s side

Adding Imports

We’ll be writing this program in one cell, and it will be around 50 lines long. The first step is to import a few additional functions that we need.

    1| # import additional functions
    2| from random import choice
    3| from IPython.display import clear_output

The second line is importing a function called “choice” which will select a random item from a list. We’ll use this to randomize the word chosen. The third line is importing a Jupyter Notebook specific function which clears the output. When using a loop, if we don’t clear the output, each new output will keep piling up below the previous ones.


Declaring Game Variables

The next step is to understand what variables we need to run the game and declare them. If you think about Hangman and what we need to keep track of, we need to track the user’s lives, the word they are trying to guess, a list of words to choose from, and whether the game is over.

    5| # declare game variables
    6| words = ["tree", "basket", "chair", "paper", "python"]
    7| word = choice(words)    # randomly chooses a word from words list
    8| guessed, lives, game_over = [], 7, False    # multi variable assignment

Line seven declares a variable called word, which will select a random item from our words list. The eighth line is where we declare three variables together; guessed will be given the value of an empty list, lives will be set to 7, and game_over will be declared to False

Note As we code along, feel free to write print statements to check the value of each variable. It helps to see what we’re declaring

Generating the Hidden Word

During the game, we want the user to be able to see how many letters are within the word. To do this, we can create a list of strings, where each string is an underscore. The number of items in the list will be set to the same length as the word chosen.

    10| # create a list of underscores to the length of the word
    11| guesses = ["_ "] * len(word)

On line 11 we’re declaring a variable called guesses, which is set to a list of underscores. We get the proper length by multiplying the list by the length of the word.


Creating the Game Loop

Every game has a main loop no matter the size of the program. Our main loop will perform the logic that we defined in our Final Design section. Rather than writing it all out at once, let’s take small steps. The first step is to be able to accept user input and stop playing the game.

    13| # create main game loop
    14| while not game_over:
    15|     ans = input("Type quit or guess a letter: ").lower()
    17|     if ans == "quit":
    18|         print("Thanks for playing.")
    19|         game_over = True

Go ahead and run the cell. If you type “quit”, the program should stop, as we are looping until game_over is set to True, which only occurs when we input “quit”.

Note Always make sure the cell is done running before moving on

Outputting Game Information

The next step is to start outputting information to the user. Let’s output their lives and the word that they’re trying to guess in a nicely formatted statement.

    14| while not game_over:    ◻◻◻
    15|     # output game information
    16|     hidden_word = "".join(guesses)
    17|     print("Word to guess: {}".format(hidden_word))
    18|     print("Lives: {}".format(lives))
    20|     ans = input(    ◻◻◻

Go ahead and run the cell. Depending on the word chosen, you’ll get a different output. If the word chosen was four letters, we’ll get an output of “Word to guess: _ _ _ _” and “Lives: 7”. The format is nothing new, but what about line 16? The reason we’re able to create a string of underscores to output in line 17 is because of the join method. It states that we want to join all the items within the guesses list together, using the string in front of .join as the separator. Here the separator is an empty string, so nothing is placed in between the items. For example:

    >>> chars = ['h', 'e', 'l', 'l', 'o']
    >>> print('-'.join(chars))

The preceding two lines would output “h-e-l-l-o”. This is a simple way to display our list as a string.
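The two ideas used for the hidden word, multiplying a list and joining it, can be tried outside the game. In this sketch the word is fixed to "tree" instead of being chosen at random (an assumption made so the result is predictable), and the name dashed is invented for the example.

```python
word = "tree"                      # fixed word for illustration

# multiplying a list repeats its items: ['_ ', '_ ', '_ ', '_ ']
guesses = ["_ "] * len(word)

# an empty separator places the items back to back
hidden_word = "".join(guesses)

# any string can act as the separator
dashed = "-".join(["h", "e", "l", "l", "o"])

print(hidden_word)
print(dashed)
```

The spaces between the underscores come from the items themselves ("_ "), not from join.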

Checking a Guess

The next step is to check and see if the user’s input was a correct guess. We won’t alter any letters just yet, as we first want to make sure we can identify a correct guess and either output that they guessed correctly or remove a life.

    24|         game_over = True    ◻◻◻
    25|     elif ans in word:    # check if letter in word
    26|         print("You guessed correctly.")
    27|     else:    # otherwise lose life
    28|         lives -= 1
    29|         print("Incorrect, you lost a life.")

Go ahead and run the cell. If you continue to guess incorrectly, you’ll notice the lives will go below zero. Be sure to guess a correct letter and an incorrect letter to know that this works.
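The guess-checking branch can also be tried without typing input by feeding it a scripted list of guesses. This is only a sketch: the scripted_guesses and results names, and the fixed word, are assumptions made for illustration and are not part of the Hangman program.

```python
word = "tree"                               # fixed word for illustration
lives = 7
scripted_guesses = ["t", "x", "e", "z"]     # hypothetical stand-in for user input
results = []

for ans in scripted_guesses:
    if ans in word:                         # same check the game uses
        results.append("correct")
    else:
        lives -= 1                          # an incorrect guess costs a life
        results.append("incorrect")

print(results, lives)
```

Two of the four scripted guesses are not in the word, so two lives are lost.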

Clearing Output

Now that we’re getting further with our program, we can see that the loop is continually outputting information below previous outputs. Let’s begin to clear the output.

    20|     ans = input(    ◻◻◻
    22|     clear_output()    # clear all previous output
    24|     if ans == 'quit':    ◻◻◻

Go ahead and run the cell. You’ll notice that it properly clears the previous information displayed no matter how long we play. This is a Jupyter Notebook specific function.

Creating the Losing Condition

The next logical operation would be creating a way to lose, since our lives can go below zero.

    31|         print('Incorrect,    ◻◻◻
    33|     if lives …

    >>> x
    >>> x x
    >>> x x x

2. Output Names. Write a loop that will iterate over a list of items and only output items which have letters inside of a string. Take the following list, for example; only “John” and “Amanda” should be output.

    >>> names = ['John', ' ', 'Amanda', 5]

3. Convert Celsius. Given a list of temperatures that are in Celsius, write a loop that iterates over the list and outputs the temperature converted into Fahrenheit. Hint: The conversion is “F = (9/5) * C + 32”.

    >>> temps = [32, 12, 44, 29]

Output would be [89.6, 53.6, 111.2, 84.2].


CHAPTER 5

Functions

This week begins the topic of functions. Along with loops, functions can be one of the tougher topics to understand. For this reason, this entire week has been dedicated to covering functions only. This is also one of the more important topics in programming. Knowing how to use a function will greatly improve your programming skills. Functions give us the ability to make our programs much more powerful and clean while also saving us time. We’ll go over how they work on the first day, but the reason we use functions is because of the ability to write once and call repeatedly. Many of the programs that we’ve already built can benefit from the use of functions, especially games like Hangman. At the end of the week, we’ll build a program that resembles a shopping cart list. We’ll see why it’s important to separate tasks such as adding, removing, and displaying into separate functions.

Overview

• How to use functions and what they are
• Passing data around using parameters
• Returning data from functions
• Understanding scope and its importance
• Creating a shopping cart program

CHALLENGE QUESTION Remember that an algorithm is nothing more than a set of step-by-step instructions. If we were to write an algorithm for changing a light bulb, what would it look like? What problems do you have to consider? How many steps are necessary? What is the most efficient method? Using the following algorithm, what problems may occur?


1. Retrieve spare bulb.
2. Turn off switch powering current bulb.
3. Unscrew current bulb.
4. Screw in spare bulb.
5. Turn on switch powering new bulb.
6. If spare bulb does not turn on, repeat steps 1 through 5.

Monday: Creating and Calling Functions

Today’s lesson is all about understanding what functions are, the stages of a function, and how to write a function. We’ll find out why they are so important in programs and how they’ll make our lives easier. To follow along with the content for today, let’s open up Jupyter Notebook from our “python_bootcamp” folder. Once it’s open, create a new file, and rename it to “Week_05.” Next, make the first cell markdown with a header saying, “Creating & Calling Functions.” We’ll begin working underneath that cell.

What Are Functions?

One of the best reference materials for programming is w3schools.¹ They even have Python tutorials. Their Python documentation describes functions as the following:

A function is a block of code which only runs when it is called. You can pass data, known as parameters, into a function. A function can return data as a result.²

Programs will often need to run the same code repeatedly, and although loops help with that, we don’t want to write the same loop many times throughout our program. The solution to the issue is using a function. Functions essentially store code that will only run when called upon.

¹ www.w3schools.com/python/
² www.w3schools.com/python/python_functions.asp

Functions are generally associated with a single task or procedure. This makes it easier for us to break down our program into functions. If you build a program that needs to repeatedly print five lines of information, and you need to output it in five different places, you would need to write 25 lines of code. Using a function, you would store the five lines in a block and call the function whenever you need it, resulting in five lines for the information to output and five lines for calling the function, for a total of eleven lines once the def line is counted. This results in a much more concise program that is easier to maintain.

Function Syntax

Like loops, functions follow an exact pattern for every function created. They all begin with the keyword “def”, followed by the name of the function. This name is arbitrary and can be almost anything; it cannot be a Python keyword, and it should not reuse the name of an existing function, since that would replace the earlier one. Directly following the name are the parentheses, and within those are parameters. We won’t cover parameters until tomorrow, so just know that parameters are optional, but the parentheses are required. Lastly, we need an ending colon, as with loops and conditional statements. See Figure 5-1 for an example.

Figure 5-1. Function syntax

Writing Your First Function Now that we know what the syntactical structure looks like, let’s go ahead and write our own


    # writing your first function
    def printInfo():    # defines what the function does when called
        print("Name: John Smith")
        print("Age: 45")

    printInfo()    # calls the function to run
    printInfo()    # calls the function again

Go ahead and run the cell. We define a function called printInfo, which prints two lines of information each time it’s called. Below that we call the function twice, which outputs the information two times. It may not seem like a more efficient program, but imagine you needed to output that exact information 20 times in a program. It’s concise and efficient.

Function Stages

In Python there are two stages to each function. The first stage is the function definition. This is where you define the name of the function, any parameters it’s supposed to accept, and what it’s supposed to do in the block of code associated with it. See Figure 5-2.

Figure 5-2. The two steps of a function life cycle (definition and call)


The second stage is known as the function call. Functions will never run until called, so you can define as many functions as you’d like, but if you never call one of them, then nothing will happen. When you call a function, it will run the block of code within the definition
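As a small illustration of the two stages, the sketch below records every run of the function body in a list. The names greet and calls are hypothetical and chosen only for this example; the point is that defining a function runs nothing until it is called.

```python
calls = []  # hypothetical helper list: records each time the body runs

def greet():
    # stage 1: the definition; this block does not run yet
    calls.append("greet ran")
    print("Hello!")

print(len(calls))  # 0, because greet has only been defined so far

greet()  # stage 2: the call runs the block
greet()
print(len(calls))  # 2, one entry per call
```

Removing the two calls leaves the list empty no matter how many functions are defined above it.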

UDF vs. Built-in

Without even knowing it, you’ve been using functions this whole time. Functions such as range, print, len, etc., are all known as “built-in” functions. They are included in Python because they serve a specific purpose to help build our applications. Now that we’re learning about functions, we can begin to create our own, known as UDFs or “user-defined functions.”

Performing a Calculation

Let’s check out one more example of a basic function, but this time do more than just print inside of the block.

    # performing a calculation in a function
    def calc():
        x, y = 5, 10
        print(x + y)

    calc()    # will run the block of code within calc and output 15

Go ahead and run the cell. We’ll get an output of 15 every time we call the calc function here.

MONDAY EXERCISES

1. Print Name. Define a function called myName, and have it print out your name when called.
2. Pizza Toppings. Define a function called pizzaToppings that prints out all your favorite pizza toppings. Call the function three times.


Although there wasn’t much coding today, it was important to understand the value of functions. Now we can separate our code into blocks, which will make the program easier to read and run

Tuesday: Parameters

One of the main reasons we use functions is so that we can make our code modular. Today is all about understanding what parameters are and how to use them in functions. To follow along with this lesson, let’s continue from our previous notebook file “Week_05” and simply add a markdown cell at the bottom that says “Parameters.”

What Are Parameters?

Although the functions we have written so far perform a specific task, they are not modular because they always print out the same response for every call. When you want to call a function with different values, you need to use parameters. Within the parentheses of the function definition is where you state the parameter name. This is an arbitrary variable name that you use to reference the value within the function block. When calling the function, you pass in the value that the block of code needs to run with. See Figure 5-3.

Figure 5-3. Accepting parameters into a function


Note: Arguments are the values passed into the function call. In the previous figure, line 3 passes the argument “John” into the printName function, where the value is assigned to the parameter name.

The function is defined with the parameter “name” within the parentheses. Again, this could be called anything, but we are expecting a person’s name to be passed in. When the block of code executes, it uses the value of that parameter within the formatted print statement. The call on line 3 is where we pass the value into the function, known as an argument. In this example, we would get an output of “Hello John”. We can now call this function and pass in any string value we want, and it will print it out. This function is now modular.

Passing a Single Parameter

Let’s use the example from Figure 5-3 to create our first function that accepts a parameter.

    # passing a single parameter into a function
    def printName(full_name):
        print("Your name is: {}".format(full_name))

    printName("John Smith")
    printName("Amanda")

Go ahead and run the cell. We get two different outputs here from the same function. Parameters allow us to pass different information for each call.

Multiple Parameters

The previous example passed a string data type into a function, so let’s see how to pass numbers and create a nicely formatted print statement.


    # passing multiple parameters into a function
    def addNums(num1, num2):
        result = num1 + num2
        print("{} + {} = {}".format(num1, num2, result))

    addNums(5, 8)        # will output 5 + 8 = 13
    addNums(3.5, 5.5)    # will output 3.5 + 5.5 = 9.0

Go ahead and run the cell. Our function definition expects two numbers to be passed into the parameters num1 and num2. Within the function block, we reference the values that were passed in by their parameter names.

Passing a List

Passing around a large amount of data is usually easiest when it is stored in a list. For that reason, functions are great at performing repetitive tasks on lists. Let’s see an example.

    # using a function to square all information
    numbers1 = [2, 4, 5, 10]
    numbers2 = [1, 3, 6]

    def squares(nums):
        for num in nums:
            print(num**2)

    squares(numbers1)
    squares(numbers2)

Go ahead and run the cell. You can see that it outputs all the numbers squared. This is much shorter than writing the for loop twice, once for each list. This is the beauty of functions and passing in parameters.

Note: Remember that nums is an arbitrary name and is the variable that we reference within the function block.


Default Parameters

In many cases, a parameter can be associated with a default value. Take the value of pi as an example; it is approximately 3.14, so we can set a parameter called pi to that value. This allows us to call the function with an already defined value for pi. If you want a more precise value for pi you can pass one in, but generally 3.14 is good enough.

    # setting default parameter values
    def calcArea(r, pi=3.14):
        area = pi * (r**2)
        print("Area: {}".format(area))

    calcArea(2)    # assuming the radius is the value of 2

Go ahead and run the cell. Now we can run the function without needing to pass a value for pi. Default parameters MUST always go after non-default parameters. In this example, the radius must be declared first, then pi.
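To see a default being overridden, here is a minimal sketch. It assumes a variant of the area function, named calc_area here for illustration, that returns the area instead of printing it, and it uses math.pi from the standard library as the more precise value.

```python
import math

def calc_area(r, pi=3.14):
    # pi falls back to 3.14 when the caller does not supply it
    return pi * (r ** 2)

rough = calc_area(2)             # uses the default value of pi
precise = calc_area(2, math.pi)  # overrides the default for this call only

print(rough)
print(precise)
```

The default is used only when the argument is left out, so the second call does not change what later calls receive.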

Making Parameters Optional

Sometimes you need to write functions that take optional arguments. The best example is always a middle name; some people have one and some do not. If we want to write a function that prints out properly for both situations, we need to make the middle name an optional parameter. We do this by assigning an empty string as the default value.

    # setting default parameter values
    def printName(first, last, middle=""):
        if middle:
            print("{} {} {}".format(first, middle, last))
        else:
            print("{} {}".format(first, last))

    printName("John", "Smith")
    printName("John", "Smith", "Paul")    # will output with the middle name


Go ahead and run the cell. Whether you pass in a middle name or not, the function runs correctly either way. Keep in mind the order of our parameters. Arguments are matched to parameters from left to right, following the function definition. If “Paul” were placed as the second value, after “John”, in the second call, our function would assign “Paul” to the parameter “last.”

Named Parameter Assignment

During the function call, you can explicitly assign values to parameter names. This is useful when you don’t want to mix up the order of the values being passed in, as they are matched from left to right by default. You can use parameter names to assign values for every parameter if you choose, but it’s not necessary most of the time. Let’s check out an example.

    # explicitly assigning values to parameters by referencing the name
    def addNums(num1, num2):
        print(num2)
        print(num1)

    addNums(5, num2=2.5)

Go ahead and run the cell. Here, we explicitly assign the value of num2 in the call by using a keyword argument.
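The following sketch shows that keyword arguments can be given in any order. The function describe is a hypothetical helper written for this illustration; it returns the full name rather than printing it.

```python
def describe(first, last, middle=""):
    # build the full name, skipping the middle name when it is empty
    parts = [first, middle, last] if middle else [first, last]
    return " ".join(parts)

by_position = describe("John", "Smith", "Paul")
by_keyword = describe(last="Smith", middle="Paul", first="John")
mixed = describe("John", last="Smith")  # positional first, then keyword

print(by_position)
print(by_keyword)
print(mixed)
```

Positional arguments must come before keyword arguments in a call, which is why the mixed call lists "John" first.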

*args

Using *args allows you to pass a variable number of arguments into a function. This lets you make functions more modular. The magic here isn’t the name “args”, however; it is the unary operator (*) in front of it. You could theoretically replace the word args with any name, such as “*data”, and it would still work. However, args is the default and general standard throughout the industry. Let’s see how we can use args in a function call.


    # using the args parameter to take in a tuple of arbitrary values
    def outputData(name, *args):
        print(type(args))
        for arg in args:
            print(arg)

    outputData("John Smith", 5, True, "Jess")

Go ahead and run the cell. You’ll notice that the args parameter takes in all the values not assigned in the call as a tuple, which is output by our first print statement. We then output each argument in that tuple. When you access the args parameter in the block, you don’t need to include the unary operator. Notice that “John Smith” is not printed out. That’s because we have two parameters in the function definition, name and *args. The first argument in the function call is mapped to the name parameter, and the rest are inserted into the args tuple. This is a useful mechanism when you’re not sure how many arguments to expect.
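As a further sketch of the same mechanism, the hypothetical function total below adds up however many numbers it receives. The last call also shows the asterisk used on the calling side to unpack an existing list into separate arguments.

```python
def total(label, *numbers):
    # numbers is a tuple holding every extra positional argument
    return "{}: {}".format(label, sum(numbers))

print(total("Sum", 1, 2, 3))
print(total("Empty"))              # no extra arguments gives an empty tuple

values = [4, 5, 6]
print(total("Unpacked", *values))  # * unpacks the list into three arguments
```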

**kwargs

Like args, kwargs allows us to take in an arbitrary number of values in a function; the difference is that the values are passed as keyword arguments and collected into a dictionary. Keyword arguments are values passed in with keys, which lets us access them easily within the function block. Again, the magic here is in the two unary operators (**), not the name kwargs. Let’s check it out.

    # using the kwargs parameter to take in a dictionary of arbitrary values
    def outputData(**kwargs):
        print(type(kwargs))
        print(kwargs["name"])
        print(kwargs["num"])

    outputData(name="John Smith", num=5, b=True)

Run the cell. This time we can see that the type is a dictionary, and we can output each key-value pair within the kwargs parameter like we would with any other dictionary. The keyword arguments in this cell are in the function call, where we specifically declare a key and value to be passed into the function.
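One thing to watch for: indexing kwargs with a key the caller did not supply raises a KeyError. The sketch below uses a hypothetical function, build_profile, that reads the name with the dictionary get method and a fallback value, so a missing key does not stop the program.

```python
def build_profile(**kwargs):
    # kwargs is a regular dictionary of the keyword arguments
    name = kwargs.get("name", "Unknown")  # fallback when no name is passed
    extras = sorted(key for key in kwargs if key != "name")
    return name, extras

profile = build_profile(name="John Smith", num=5, b=True)
anonymous = build_profile(num=5)

print(profile)
print(anonymous)
```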


TUESDAY EXERCISES

1. User Input. Ask the user to input a word, and pass that word into a function that checks whether the word starts with an uppercase letter. If it does, output “True”; otherwise, output “False”.
2. No Name. Define a function that takes in two arguments, first_name and last_name, and makes both optional. If no values are passed into the parameters, it should output “No name passed in”.

Today was all about function parameters and how to use them. Parameters make the functions in our program modular, so that we can reduce the number of lines of code we write.

Wednesday: Return Statement

So far we have been printing out the data that our functions alter, but what do you do if you need to access this information later? Functions can manipulate data and then send it back to where the function call occurred, so that the information can be saved for later use. Today we’ll learn how to do that and why it’s useful. To follow along with this lesson, let’s continue from our notebook file “Week_05” and simply add a markdown cell at the bottom that says “Return Statement.”

How It Works

Figure 5-4 depicts how two parameters passed into a function are first used in a calculation, and how the result is then returned to the original location of the call, where it is saved into a variable. This variable can then be used later in the program with that value.


Figure 5-4. Returning information and storing it into a variable

You can return any data type, but a function can only return a single object. When you need to return more than one piece of data, you return a collection of data.

>>> def returnMultiple():
>>>     a = 5
>>>     b = 10
>>>     return [a, b]    # one data type holding multiple items

Using Return

The return statement is used to send information back to where the function call occurred. So far, we have used the print statement to output information, but that won’t work if we need access to the value later in the program. Instead, we can return the value and save it into a variable that we can work with later. Let’s check out a few examples.

    # using the return keyword to return the sum of two numbers
    def addNums(num1, num2):
        return num1 + num2

    num = addNums(5.5, 4.5)    # saves the returned value into num
    print(num)
    print(addNums(10, 10))     # doesn't save the returned value

117

Chapter 5

Functions

Go ahead and run the cell. We get 10.0 and 20 as output. When we call addNums the first time, it runs the function with 5.5 and 4.5 and returns the sum, which is then stored in num. The second time we call the function, we simply print the result in place. From here, we can reuse the value stored in num, but not the value returned by the second call.
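Returning a collection, as mentioned earlier, pairs well with unpacking. In the sketch below, the hypothetical function min_and_max returns a single tuple, and the caller splits that tuple into two variables.

```python
def min_and_max(nums):
    # the comma builds one tuple that holds both values
    return min(nums), max(nums)

result = min_and_max([4, 9, 1, 7])   # a single tuple object
lowest, highest = result             # unpack the tuple into two names

print(result)
print(lowest, highest)
```

The function still returns exactly one object; the unpacking on the calling side is what makes it feel like two return values.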

Ternary Operator

A ternary operator is a shorthand Python branching statement. These operators can be used to assign a value to a variable or, as in this case, to decide what a function returns.

    # shorthand syntax using a ternary operator
    def searchList(aList, el):
        return True if el in aList else False

    result = searchList(["one", 2, "three"], 2)    # result = True
    print(result)

Go ahead and run the cell. The ternary operator returns True because the given condition is met. The same code written out normally would look like the following.

>>> if el in aList:
>>>     return True
>>> else:
>>>     return False

It’s good practice to write less code when you can, but it isn’t required.
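The same shorthand works for plain assignment as well as for return values. The sketch below uses hypothetical names (describe_number, age, status) to show both uses of the pattern: value if condition else other value.

```python
def describe_number(n):
    # value_if_true if condition else value_if_false
    return "even" if n % 2 == 0 else "odd"

label = describe_number(7)

age = 20
status = "adult" if age >= 18 else "minor"  # ternary used in an assignment

print(label)
print(status)
```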

WEDNESDAY EXERCISES

1. Full Name. Create a function that takes in a first and last name and returns the two names joined together.
2. User Input. Within a function, ask for user input. Have this function return that input so that it can be stored in a variable outside of the function. Then print out the input.


Today we learned how to get information back from a function. This lets us save the data it manipulates for later use.

Thursday: Scope

Today we’ll discuss an important concept called scope. This concept deals with the accessibility of variables declared within a program. We’ll look at the different types of scope and how to handle them. To follow along with this lesson, let’s continue from our previous notebook file “Week_05” and simply add a markdown cell at the bottom that says “Scope.”

Types of Scope

In Python, there are three types of scope: global, function, and class. We haven’t gone over classes yet, so we’ll discuss class scope in a later chapter. Without knowing it, we have already been using the other two types of scope. Global scope is when you declare a variable to be accessible to an entire file or application. Most of the variables we have declared so far have been global variables, which is fine in a Jupyter Notebook. Function scope refers to variables that are declared, and only accessible, within functions. A variable declared inside a function cannot be accessed outside of the function, because once the function ends, so do the variables declared within it.

Global Scope Access

When global variables are defined, they are accessible to the rest of the file. However, we must keep in mind how function scope works. A function can read a global variable, but as soon as the function assigns to that name, Python treats it as a local variable of the function, so the global value cannot be changed this way. Let’s see an example.

    # where global variables can be accessed
    number = 5

    def scopeTest():
        number += 1    # raises an error: the assignment makes number local to the function

    scopeTest()


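Go ahead and run the cell. We end up with an error (an UnboundLocalError) because the assignment makes number a local variable of the function, and that local variable has no value yet when we try to add to it. Inside a function, you can safely change only the variables that are declared inside it or passed in.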
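One nuance is worth seeing in code: reading a global variable inside a function works, and it is only assignment that triggers the error. The sketch below uses hypothetical function names to contrast the three cases, including the global keyword, which lets a function rebind a global variable (returning a value is usually the cleaner choice).

```python
number = 5

def read_number():
    # reading a global variable inside a function is allowed
    return number + 1

def broken_increment():
    # the assignment makes "number" local, so this raises UnboundLocalError
    number += 1

def increment_global():
    global number  # opt in to rebinding the global variable
    number += 1

print(read_number())

caught = False
try:
    broken_increment()
except UnboundLocalError as error:
    caught = True
    print("Error:", error)

increment_global()
print(number)
```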

Note: When passing in arguments, the function receives only the value, not the variable itself.

Handling Function Scope

When dealing with variables declared in a function, you generally won’t need to access them outside of the function. However, when you do need that value, the best practice is to return it.

    # accessing variables defined in a function
    def scopeTest():
        word = "function"
        return word

    value = scopeTest()
    print(value)

Go ahead and run the cell. Now we have access to the word defined within the function; we simply assigned the returned value to another variable to work with.

In-Place Algorithms

When passing a variable into a function, the function works with the value, not with the variable itself. Rebinding the parameter inside the function therefore has no effect on the original variable, so the following will not alter the variable num.

>>> num = 5
>>> def changeNum(n):
>>>     n += 5
>>> changeNum(num)
>>> print(num)

This is different, though, when changing information through an index. A list is a mutable object, and the function receives the very same list object that the caller holds, so changing an element of the list by its index position alters the original variable. Let’s see an example.


    # changing list item values by index
    sports = ["baseball", "football", "hockey", "basketball"]

    def change(aList):
        aList[0] = "soccer"

    print("Before Altering: {}".format(sports))
    change(sports)
    print("After Altering: {}".format(sports))

Go ahead and run the cell. Notice how the first item in the sports list changes once the function is called. This happens because the function changes the item at that index on the same list object that was passed in. These are known as in-place algorithms because, no matter where you alter the information, the change is made directly to the values in memory.
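When you do not want the caller's list to change, one common approach is to work on a copy. The sketch below, with hypothetical function names, contrasts an in-place change with a change made to a shallow copy created by the list copy method.

```python
def change_in_place(a_list):
    a_list[0] = "soccer"      # mutates the caller's list

def change_copy(a_list):
    new_list = a_list.copy()  # shallow copy; the original is untouched
    new_list[0] = "soccer"
    return new_list

sports = ["baseball", "football", "hockey", "basketball"]

copied = change_copy(sports)
first_before = sports[0]      # the original still starts with "baseball"

change_in_place(sports)
first_after = sports[0]       # now the original itself has changed

print(copied[0], first_before, first_after)
```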

THURSDAY EXERCISES

1. Names. Create a function that changes the list passed in, placing the name given as a parameter at a given index. For example, if I were to pass in “Bill” and index 1, it would change “Rich” to “Bill.” Use the following list and function definition.

>>> name = ['Bob', 'Rich', 'Amanda']
>>> def changeValue(aList, name, index):

Today was important for understanding how variable accessibility works. Knowing this will help keep our variables safe.

Friday: Creating a Shopping Cart

For today’s project, we’ll build an application that stores products in a list. We’ll be able to add, remove, clear, and show the products in the cart. All of the concepts taught over the past few weeks will be used. To follow along with this lesson, let’s continue from our previous notebook file “Week_05” and add a markdown cell at the bottom that says “Friday Project: Creating a Shopping Cart.”


Final Design

Because we introduced functions this week, the final design is based on the logic of the actions in our program. Functions perform a specific task, usually an action. For our shopping cart program, the actions we need to consider are adding, removing, clearing, and showing the items in the cart. The logic design looks like Figure 5-5.

Figure 5-5. Shopping cart program logic

We will also need a main function that contains the loop and handles the user’s input.

Initial Setup

Like last week’s project, we’ll create the program in a single cell, so make sure you are familiar with the concepts we used in that project. To begin, let’s import the clear function from Jupyter Notebook and declare a global variable to work with.

1.  # import necessary functions
2.  from IPython.display import clear_output
4.  # global list variable
5.  cart = []

We declare a global variable for the cart so that we can work with it throughout the program. We use a list because we need to store several items. Using a list also lets us edit the variable directly, without passing it back and forth, because of how item assignment works.


Adding Items

As stated in the initial design, we want to create our functions first. We’ll start with the function that adds items to our cart variable.

7.  # create function to add items to cart
8.  def addItem(item):
9.      clear_output()
10.     cart.append(item)
11.     print("{} has been added.".format(item))

We won’t call this function until later, when we create the main loop. When called, this function clears the output, appends the item passed in as a parameter, and tells the user what was added.

Removing Items

Next, we’ll create the function that removes items from our cart variable.

13. # create function to remove items from cart
14. def removeItem(item):
15.     clear_output()
16.     try:
17.         cart.remove(item)
18.         print("{} has been removed.".format(item))
19.     except ValueError:
20.         print("Sorry we could not remove that item.")

We make sure to wrap the remove statement in a try and except clause because removing an item that does not exist raises a ValueError, which would crash the program. The clause prevents that: the function either removes the item properly or tells the user that it could not be removed.


Showing the Cart

We want the user to be able to view the cart at any time, using a simple loop.

22. # create function to show items in cart
23. def showCart():
24.     clear_output()
25.     if cart:
26.         print("Here is your cart:")
27.         for item in cart:
28.             print("- {}".format(item))
29.     else:
30.         print("Your cart is empty.")

In the function, we clear the output first and then check whether there are any items in the cart. If it is empty, we let the user know; otherwise, we loop through the cart and print each item.

Clearing the Cart

One of the last functions we need is the ability to clear the cart.

32. # create function to clear items from cart
33. def clearCart():
34.     clear_output()
35.     cart.clear()
36.     print("Your cart is empty.")

Using the built-in clear method, we remove all the items from the cart and let the user know.

Creating the Main Loop

So far, we have created the functions that handle the user’s actions. Now we need to set up the main function of the program, which contains the main loop and the ability to quit.


38. # create main function that loops until the user quits
39. def main():
40.     done = False
42.     while not done:
43.         ans = input("quit/add/remove/show/clear: ").lower()
45.         # base case
46.         if ans == "quit":
47.             print("Thanks for using our program.")
48.             showCart()
49.             done = True
51. main()    # run the program

Go ahead and run the cell. Now you can type “quit” to exit the program. We haven’t yet set up what to do for any input other than quitting, so any other answer simply prompts again. We also use the boolean variable done to keep track of whether the main loop has finished.

Handling User Input

The last step of this program is to add in the functions we created earlier to handle the user’s input.

49.             done = True
    ◽◽◽
50.         elif ans == "add":
51.             item = input("What would you like to add? ").title()
52.             addItem(item)
53.         elif ans == "remove":
54.             showCart()
55.             item = input("What item would you like to remove? ").title()
56.             removeItem(item)
57.         elif ans == "show":
58.             showCart()
59.         elif ans == "clear":
60.             clearCart()
61.         else:
62.             print("Sorry that was not an option.")
64. main()    # run the program

Go ahead and run the cell. We have included several elif statements to handle the user’s input. Now, depending on what they choose, we are able to call the necessary function. On lines 51 and 55, we accept a second input from the user for the item they would like to add or remove, and we convert it to title case so that differences in capitalization do not affect matching. If they don’t choose a valid task, we let them know through the else clause.

Final Output

Congratulations on finishing this project! Because of the size of the project, you can find the completed version of the code on GitHub. To find the specific code for this project, simply open or download the “Week_05.ipynb” file. If you ran into errors along the way, be sure to cross-reference your code with the code in this file and see where you may have gone wrong.

Today we were able to build a complete shopping cart program using functions. We can see that our main loop is clean and easy to read. Even with this small program, we can see the power of functions.

Weekly Summary

This week was a big step in improving our programming skills. We learned that functions are useful for reducing the number of lines of code we write. They help make our programs more efficient and easier to read. They can be made modular by using parameters, and they can return specific data by using the return keyword. One of the last concepts we covered was how scope works in a project and how it affects variable accessibility. At the end of the week, we built a shopping cart program together that showed how functions can be used in a program. Next week, we’ll continue building our knowledge with advanced variable types known as data collections.


Challenge Question Solution

The purpose of this challenge was to get you thinking about what can go wrong in the steps that were laid out. Before you begin programming an algorithm, you need to understand what can go wrong with the steps you’ve designed, because computers are only as smart as you program them to be. There are several problems with this algorithm. Most notable is between steps 2 and 3, where we attempt to replace the light bulb. Did you check whether the bulb was too hot to touch? As humans, basic instincts take over and we would stop touching it, but computers will continue to perform the task they were told to do. Other glaring problems include checking whether the replacement bulb is the right type, and what to do with the bulb we just replaced. The algorithm doesn’t specify a step for disposing of it properly, so do we keep it in our hand forever? When you start building your own algorithms, you need to make sure not only that the algorithm works but also that you have thought about how to handle error-prone situations.

Weekly Challenges

To test out your skills, try these challenges:

1. Refactor Hangman. This is a large task, so tread lightly, but try to refactor the Hangman project from last week to use functions. Think about what actions Hangman requires, and turn those tasks into functions.
2. Removing by Index. In the shopping cart program, set up the remove function so that you can also remove by index. Set up the list so that it prints out as a numbered list, and when asked to remove an item, the user can also type in the number next to the list item. For example, using the following, you could type in “1” to remove “Grapes”:

>>> 1) Grapes
>>> What would you like to remove?


CHAPTER 6

Data Collections and Files

There are several data structures in Python. We will cover dictionaries, sets, tuples, and frozensets this week to round out our knowledge of collections. Each one has a specific purpose, as we will see from the differences between them. Knowing how to work with files in any language is crucial. To work with data, we need to know how to read from and write to several types of files. We will cover how to work with text files and CSV files.

Overview

• Understanding dictionaries
• Working with dictionaries
• Learning other important data collections
• Working with files
• Creating a sample database with files

CHALLENGE QUESTION

This week’s challenge is to write a function that checks whether a word is a palindrome. The function should take in a single parameter and return True or False. Try writing the function out on paper first, and then try programming it.

Monday: Dictionaries

Today, we’ll learn about a valuable data collection: the dictionary. Dictionaries store information using keys, and looking up a value by its key is generally much faster than searching through a Python list.


To follow along with today’s content, open up Jupyter Notebook from our “python_bootcamp” folder. Once it’s open, create a new file and rename it to “Week_06.” Next, make the first cell a markdown cell with a header that says “Dictionaries.” We’ll begin working underneath that cell.

What Are Dictionaries?

A dictionary is an unordered collection of data, stored in key-value pairs. “Unordered” refers to the way it is stored in memory. It is not accessible through an index; instead, it is accessed through a key. (Since Python 3.7, dictionaries do preserve insertion order, but items are still accessed by key rather than by position.) Lists are known as ordered data collections because each item is assigned a specific position. Dictionaries work like a real-life dictionary, where the key is the word and the value is the definition. Dictionaries are useful for working with large data, mapped data, CSV files, APIs, sending or receiving data, and more.

Declaring a Dictionary

Like other variables, the name of the variable goes on the left of the equals operator, and the dictionary goes on the right. All dictionaries are created using an opening and closing curly bracket. Between the curly brackets, we define our key-value pairs. Keys are most often strings or numbers; more precisely, any immutable, hashable value can serve as a key. A colon separates the key and the value. After the colon comes the value, which can be any data type, including other data collections or even another dictionary.

    # declaring a dictionary variable
    empty = {}    # empty dictionary
    person = {"name": "John Smith"}    # dictionary with one key/value pair
    customer = {"name": "Morty", "age": 26}    # dictionary with two key/value pairs
    print(customer)


Go ahead and run the cell. Here we declare three different dictionaries: an empty one, one with a single key-value pair, and one with more than one key-value pair. All key-value pairs must be separated by commas. Next, we’ll see how to access this data.
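To illustrate which values can act as keys, the sketch below builds a small dictionary with a string key, an integer key, and a tuple key, and then shows that a list is rejected because it is mutable and therefore not hashable. The dictionary contents are made up for this example.

```python
scores = {
    "name": "John",            # string key
    2018: 108,                 # integer key
    (42.3, -71.1): "Boston",   # tuple key (immutable, so hashable)
}

message = ""
try:
    scores[["a", "list"]] = 1  # lists are mutable, so they cannot be keys
except TypeError as error:
    message = str(error)

print(scores[(42.3, -71.1)])
print(message)
```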

Note: You can also use dict() to declare an empty dictionary.

Accessing Dictionary Information

All data stored in a dictionary is accessed through the key associated with the value you are trying to access. We simply write the name of the dictionary followed by square brackets. Inside the square brackets goes the key. This retrieves the value stored at that key.

    # accessing dictionary information through keys
    person = {"name": "John"}
    print(person["name"])    # access information through the key

Go ahead and run the cell. This outputs “John” because that is what is stored at the “name” key.

Using the Get Method

Another way to retrieve information is to use the get() method. The main difference between this method and the previous way of accessing a value is that the get method will not raise a key error. If the key doesn’t exist, it simply returns None. You can also add a second argument to the call to have the program return a more specific value instead. Let’s try it.

    # using the get method to access dictionary information
    person = {"name": "John"}
    print(person.get("name"))    # retrieves the value of the name key as before
    print(person.get("age", "Age is not available."))    # get is a safe way to retrieve information


Go ahead and run the cell. On the second print statement, we get the message “Age is not available.” because the key “age” doesn’t exist. This gives us a safer way to retrieve information.
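The difference between the two access styles can be seen side by side in the following sketch. The helper safe_age is hypothetical; it wraps get with a fallback, while the square-bracket lookup on the same missing key raises a KeyError.

```python
person = {"name": "John"}

def safe_age(record):
    # returns the stored age, or a fallback message when the key is missing
    return record.get("age", "Age is not available.")

try:
    person["age"]          # square brackets raise KeyError for a missing key
    raised = False
except KeyError:
    raised = True

print(raised)
print(person.get("age"))   # get without a fallback returns None
print(safe_age(person))
print(safe_age({"name": "Ann", "age": 26}))
```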

Dictionaries with Lists

Dictionaries become powerful when you start working with data collections as values.

    # storing a list within a dictionary and accessing it
    data = {"sports": ["baseball", "football", "hockey", "soccer"]}
    print(data["sports"][0])    # first access the key, then the index

Go ahead and run the cell. To access the list, we first have to access the “sports” key. After that, we can access items through an index, as with any other list. This outputs “baseball”. Remember that we can’t create a dictionary that stores a list without attaching a key first.

    # improperly storing a list within a dictionary
    sports = ["baseball", "football", "hockey", "soccer"]
    sports_dict = dict(sports)    # will produce an error, no key

This produces an error because there is no key associated with the sports variable. To store this list properly, you would write the following.

>>> sports_dict = dict({"sports": sports})

Lists with Dictionaries

The combination of lists within dictionaries, and vice versa, can get confusing when you are trying to figure out how to access information. Always remember that lists are indexed and dictionaries use keys. Depending on the order in which the data is stored, you need to do one or the other first. When a list stores a dictionary, you first need to access the dictionary by its index. After that, you have access to the key-value pairs within the dictionary. Let’s see an example.

    # storing a dictionary within a list and accessing it
    data = ["John", "Dennis", {"name": "Kirsten"}]
    print(data[2])            # the dictionary is at index 2
    print(data[2]["name"])    # first access the index, then access the key

Go ahead and run the cell. First, we access the item at index 2, which is our dictionary. Then we access the value stored at the “name” key, which outputs “Kirsten”.

Note: Be very careful when using numbers as keys.
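One likely reason for this caution is that a dictionary with integer keys is accessed with exactly the same syntax as a list index, so the two are easy to confuse. The sketch below is only an illustration; the variable names are made up for this example.

```python
# A list and a dictionary can be accessed with the same-looking expression.
names_list = ["John", "Dennis", "Kirsten"]
names_dict = {1: "John", 2: "Dennis", 3: "Kirsten"}

print(names_list[1])    # index 1 is the second item in the list
print(names_dict[1])    # key 1 is whatever value was stored under that key

# The integer 2018 and the string "2018" are two different keys.
wins = {2018: 108}
print(wins.get(2018))      # finds the value stored under the integer key
print(wins.get("2018"))    # the string key was never added, so get returns None
```

Here the same expression `[1]` gives “Dennis” for the list but “John” for the dictionary, which is the kind of mix-up the note warns about.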

Dictionaries with Dictionaries

Dictionaries are powerful and efficient because of the way they are stored in memory. Often you'll want to use a dictionary as the value in one of your key-value pairs. Let's see an example:

    # storing a dictionary within a dictionary and accessing it
    data = {
        "team": "Boston Red Sox",
        "wins": {"2018": 108, "2017": 93}
    }
    print(data["wins"])            # will output the dictionary within the wins key
    print(data["wins"]["2018"])    # first access the wins key, then the next key

Go ahead and run the cell. This will output “108” on the second statement. We reach this information by accessing the first key, “wins”, followed by the second key, “2018”.

MONDAY EXERCISES

1. User Input: Ask the user for their name and age, and then create a dictionary with those key-value pairs. Output the dictionary once it is created.

2. Accessing Ingredients: Output all the ingredients from the following list within the “ingredients” key using a for loop:

    >>> pizza = {
    >>>     'ingredients': ['cheese', 'sausage', 'peppers']
    >>> }

Dictionaries let us work with large amounts of data stored as key-value pairs. Remember that the data is accessed through the keys.

Tuesday: Working with Dictionaries

Today's lesson covers how to add data, manipulate data, remove key-value pairs, and iterate through dictionaries. To follow along with this lesson, continue from our previous notebook file “Week_06” and simply add a markdown cell at the bottom that says “Working with Dictionaries.”

Adding New Information

You'll often need to add new key-value pairs after declaring a dictionary. Let's see how:

    # adding new key/value pairs to a dictionary
    car = {"year": 2018}
    car["color"] = "Blue"
    print("Year: {} \t Color: {}".format(car["year"], car["color"]))

Go ahead and run the cell. To add a new pair, on the left side of the equals operator you provide the dictionary name followed by the new key in square brackets. On the right side is whatever value you'd like. This will output a nicely formatted string with our car information.

Note: As of Python 3.7, dictionaries preserve insertion order by default. In older versions of Python, key-value pairs did not always keep their order, and you would need to use an OrderedDict().
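To make the note concrete, here is a small sketch comparing a regular dictionary with OrderedDict from the standard library's collections module. The extra “make” key is only an assumption added for illustration.

```python
from collections import OrderedDict

# On Python 3.7+, a regular dictionary remembers the order keys were added.
car = {"year": 2018}
car["color"] = "Blue"
car["make"] = "Ford"    # hypothetical extra key, for illustration only
print(list(car.keys()))

# OrderedDict gives the same guarantee on older versions of Python as well.
ordered_car = OrderedDict()
ordered_car["year"] = 2018
ordered_car["color"] = "Blue"
ordered_car["make"] = "Ford"
print(list(ordered_car.keys()))
```

On Python 3.7 or newer, both print statements list the keys in the order they were added.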

Changing Information

Changing a key-value pair works exactly like adding a new one. If the key exists, the assignment simply overwrites the previous value; if it doesn't, a new pair is created.

    # updating a value for a key/value pair that already exists
    car = {"year": 2018, "color": "Blue"}
    car["color"] = "Red"
    print("Year: {} \t Color: {}".format(car["year"], car["color"]))

Go ahead and run the cell. The syntax is the same as when we declared a new key-value pair earlier, but because the “color” key already exists in the dictionary, it simply overwrites the previous value.

Deleting Information

Sometimes you need to remove a certain pair. To do so, you use the del statement:

    # deleting a key/value pair from a dictionary
    car = {"year": 2018}
    try:
        del car["year"]
        print(car)
    except KeyError:
        print("That key does not exist")

Go ahead and run the cell. Be very careful when deleting key-value pairs. If the key you're trying to delete doesn't exist, Python raises a KeyError and the program crashes. To avoid that, we wrap the deletion in a try/except.
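As an alternative to try/except, the dictionary pop() method accepts a default value that is returned when the key is missing. This is a different technique from the del statement shown above, and the variable names below are only illustrative.

```python
car = {"year": 2018}

# The key exists, so pop removes the pair and returns its value.
removed = car.pop("year", None)

# The key is missing, so pop returns the default instead of raising an error.
missing = car.pop("color", None)

print(removed)
print(missing)
print(car)    # the dictionary is now empty
```

Whether del or pop() reads better depends on whether you need the removed value afterward.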

Looping a Dictionary

Dictionaries are iterable, just like lists. However, they offer three different methods for doing so. You can iterate over both the keys and values together, only the keys, or only the values.

Looping Only Keys

To iterate through a dictionary while only accessing the keys, you use the .keys() method:

    # looping over a dictionary via the keys
    person = {"name": "John", "age": 26}
    for key in person.keys():
        print(key)
        print(person[key])    # will output the value at the current key

Go ahead and run the cell. As we iterate over person, our temporary variable key is equal to each key name in turn. This still lets us access each value through the key variable.

Looping Only Values

When you don't need to access the keys, the .values() method is the best choice:

    # looping over a dictionary via the values
    person = {"name": "John", "age": 26}
    for value in person.values():
        print(value)

Go ahead and run the cell. We won't have access to the key names, but with this method we're only trying to get the values. Our temporary variable value stores each value from the key-value pairs as we iterate over person.

Looping Key-Value Pairs

If you need access to both the key and the value, use the .items() method. This approach assigns two temporary variables instead of one:

    # looping over a dictionary via the key/value pair
    person = {"name": "John", "age": 26}
    for key, value in person.items():
        print("{}: {}".format(key, value))

Go ahead and run the cell. As we iterate over person, each key-value pair is assigned to the temporary variables key and value, respectively. We now have easy access to both.

Note: The temporary variables are commonly named “k” and “v.”

TUESDAY EXERCISES

1. User Input: Declare an empty dictionary. Ask the user for their name, address, and number. Add that information to the dictionary and iterate over it to show it to the user.

2. Problem-Solving: What's wrong with the following code?

    >>> person = { 'name', 'John Smith' }
    >>> print( person['name'] )

Today was important for understanding how to work with dictionaries. Remember that adding and changing key-value pairs use the same syntax.

Wednesday: Tuples, Sets, Frozensets

Python includes several other data collections, each with its own features. Today, we'll look at three more that can be useful at times. To follow along with this lesson, continue from our notebook file “Week_06” and simply add a markdown cell at the bottom that says “Tuples, Sets, Frozensets.”

What Are Tuples?

A tuple is like a list, except that it is immutable. When something is immutable, it cannot be changed once it has been declared. Tuples are useful for storing information that you don't want to change. They are ordered like lists, so you can iterate through them using an index.

Declaring a Tuple

To declare a tuple, you use a comma to separate two or more items. Lists are denoted by square brackets on the outside, whereas tuples can be declared with optional parentheses. They are more likely to be declared with parentheses because it is easier to read. Let's see an example:

    # declaring a tuple
    t1 = ("hello", 2, "hello")      # with parentheses
    t2 = True, 1                    # without parentheses
    print(type(t1), type(t2))       # both are tuples
    t1[0] = 1                       # raises an error: tuples are immutable once declared

Go ahead and run the cell. You can see that we output the types of our variables, and both output “tuple”. As stated, tuples can be declared with or without parentheses. The last line in this cell produces an error because the items of a tuple cannot be changed once declared. The only way to overwrite data in a tuple is to redeclare the entire tuple.
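To see what that error looks like without stopping the cell, and how redeclaring works in practice, here is a minimal sketch. Catching the error with try/except and rebuilding the tuple with slicing are our own illustration, not part of the lesson's cell.

```python
t1 = ("hello", 2, "hello")

# Assigning to an index raises a TypeError because tuples are immutable.
try:
    t1[0] = 1
except TypeError as error:
    print("Cannot change a tuple:", error)

# To "change" a tuple, build a new one and reassign the variable.
t1 = (1,) + t1[1:]
print(t1)
```

Note the trailing comma in `(1,)`: it is the comma, not the parentheses, that makes a one-item tuple.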

What Are Sets?

A set is a collection of information like a list; however, every item in a set is unique. Sets are also an unordered collection. This means items cannot be accessed by index; instead, you look up the value itself, much like a dictionary key. They can still be iterated over, the same way dictionary keys can. Sets are practical in situations where you need to store unique items.

Declaring a Set

There are two ways to declare a set. The first is to call the built-in set() function and pass it a list in square brackets. The second, more practical way looks like a dictionary: you declare it with a pair of curly brackets. Let's check it out:

    # declaring a set
    s1 = set([1, 2, 3, 1])       # using set() and square brackets
    s2 = {4, 4, 5}               # using curly brackets, like a dictionary
    print(type(s1), type(s2))
    s1.add(5)                    # using the add method to add new items to a set
    s1.remove(1)                 # using the remove method to get rid of the value 1
    print(s1)                    # the duplicate 1 was dropped when the set was created

Go ahead and run the cell. We'll see that it outputs the type of both variables as “set”. Remember that sets hold unique items, so the second “1” was dropped as soon as s1 was created, leaving 1, 2, and 3. After adding 5 and removing the remaining 1, printing s1 outputs {2, 3, 5}. Sets have a variety of methods that allow us to add, remove, and change the information inside them, as seen with the add and remove lines.
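Since set items are looked up by value rather than by index, the most common operation is a membership test with the in keyword. The sketch below also shows a few other built-in set operations; it is only an illustration, and the variable names are our own.

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}

has_two = 2 in s1     # membership test: is the value 2 in the set?
union = s1 | s2       # every item that appears in either set
common = s1 & s2      # only the items that appear in both sets

s1.discard(10)        # unlike remove, discard does not raise an error if the value is missing

print(has_two)
print(union)
print(common)
```

If you would rather get an error when a value is missing, use remove(); it raises a KeyError in that case.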

What Are Frozensets?

Frozensets are essentially an immutable version of a set. They are immutable, unordered, and unique. They are a good fit for sensitive information such as bank account numbers, since you wouldn't want those to change. They can be iterated over, but not indexed.

Declaring a Frozenset

To declare a frozenset, you call the built-in frozenset() function and pass it a list in square brackets. There is no literal syntax for frozensets, so this is the only way to declare one. Let's see an example:

    # declaring a frozenset
    fset = frozenset([1, 2, 3, 4])
    print(type(fset))

Go ahead and run the cell. We won't use frozensets very often in this book, but each of these data collections serves a specific purpose in the Python language.
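To see what “immutable” means in practice for a frozenset, here is a short sketch. Using a frozenset as a dictionary key is our own illustration of one consequence of immutability; the labels dictionary is hypothetical.

```python
fset = frozenset([1, 2, 3, 4])

# A frozenset has no add or remove methods, so this raises an AttributeError.
try:
    fset.add(5)
except AttributeError:
    print("A frozenset cannot be changed after it is declared")

# Because it can never change, a frozenset can be used as a dictionary key.
labels = {frozenset(["read", "write"]): "editor"}
print(labels[frozenset(["write", "read"])])    # order does not matter in a set
```

A regular set cannot be used as a key this way, because it is mutable.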

Data Collection Differences

Table 6-1 shows a summary of the differences between each collection.

Table 6-1. Collection similarities and differences

Data Collection   Ordered      Iterable   Unique      Immutable   Mutable
List              Yes          Yes        No          No          Yes
Dictionary        Yes (3.7+)   Yes        Keys only   Keys only   Values only
Tuple             Yes          Yes        No          Yes         No
Set               No           Yes        Yes         No          Yes
Frozenset         No           Yes        Yes         Yes         No

WEDNESDAY EXERCISES

1. User Input: Ask the user to enter as many bank account numbers as they'd like, and store them in a list initially. Once the user is done entering information, convert the list to a frozenset and print it out.

2. Conversion: Convert the following list into a set of unique values. Print it out afterward to check that there are no duplicates.

    >>> nums = [3, 4, 3, 7, 10]

Today we were able to look at three other data collections. Each one has a purpose, even though we mainly work with dictionaries and lists.

Thursday: Reading and Writing Files

Depending on the type of program you're writing, you'll need to save or access information. To do so, you'll need to understand how to work with files, whether that means creating, writing, or reading them.

To follow along with this lesson, continue from our previous notebook file “Week_06” and simply add a markdown cell at the bottom that says “Reading & Writing Files.”

Working with Text Files

By default, Python comes with the open() function, which allows us to create or modify files. This function accepts two parameters: the file name and the mode. If the file exists, it simply opens the file; if it doesn't, the write and append modes create it for you. The mode refers to how Python opens and works with the file. For instance, if you only need to get information from the file, you would open it for reading. This lets you work with the file without accidentally changing it. Let's see how to open, write to, and read a text file:

    1| # opening/creating and writing to a text file
    2| f = open("test.txt", "w+")    # open file in writing and reading mode
    3| f.write("this is a test")
    4| f.close()
    5| # reading from a text file
    6| f = open("test.txt", "r")
    7| data = f.read()
    8| f.close()
    9| print(data)

Go ahead and run the cell. Let's walk through this line by line. We open the file in writing and reading mode so that we can fully edit it, and we assign the resulting file object to the variable f. On line 3, we use the write() method to write our sentence to the file. Then we close the file. Any time you open a file, you must always close it. After we've created and written to our test file, we open it back up in read-only mode. On line 7, we use the read() method to read the entire contents of the file into a single string, which is assigned to our data variable. Then we output the information.

Note: The “w” mode overwrites the entire file. Use “a” to append.
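The difference between the two modes is easiest to see side by side. The following sketch writes to a file in a temporary folder so it doesn't touch your notebook directory; the file name “append_demo.txt” and the use of the tempfile module are assumptions made for this example.

```python
import os
import tempfile

# hypothetical file in a temporary folder, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")

f = open(path, "w")
f.write("first line\n")
f.close()

f = open(path, "w")     # "w" again: the previous contents are wiped out
f.write("second line\n")
f.close()

f = open(path, "a")     # "a": new text is added to the end of the file
f.write("third line\n")
f.close()

f = open(path, "r")
data = f.read()
f.close()
print(data)
```

Only the second and third lines survive, because the second open() in “w” mode erased the first line.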

Writing to CSV Files

CSV files store data by placing a comma between each cell. This is known as a tabular data structure. To work with them, Python has a default library called “csv.” We'll need to import it in order to use it. After importing this library, we're going to use the second way of opening a file, with the with keyword. Although a with block runs only once, you can think of it a little like a while loop: while the file is open, we can work with it, and once the block of code has finished running, the file is closed for us automatically. Let's see an example:

    1| # opening/creating and writing to a csv file
    2| import csv
    3| with open("test.csv", mode="w", newline="") as f:
    4|     writer = csv.writer(f, delimiter=",")
    5|     writer.writerow(["Name", "City"])
    6|     writer.writerow(["Craig Lou", "Taiwan"])

Go ahead and run the cell. Let's walk through this line by line. We import the CSV library on line 2. Then we open the file in write mode as the variable f. We also set the newline parameter to an empty string so that no blank lines are created between rows. On line 4, we create a writer variable that lets us write to the CSV file. The last two lines write a couple of rows of data to the file. When the block finishes, the file is closed automatically and we're done. Go ahead and check out the file. Remember that write mode always overwrites any data that was previously in the file.
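To confirm that the with statement really does close the file for us, we can check the file object's closed attribute inside and outside the block. This is a minimal sketch; the temporary folder and the helper variable names are assumptions for the example.

```python
import csv
import os
import tempfile

# hypothetical location in a temporary folder, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "test.csv")

with open(path, mode="w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    writer.writerow(["Name", "City"])
    closed_inside = f.closed     # the file is still open inside the block

closed_after = f.closed          # the file was closed when the block ended

print(closed_inside)
print(closed_after)
```

Because the closing happens automatically, there is no f.close() call to forget.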

Reading from CSV Files

To read data from the CSV file that we just created, we simply set the mode to read:

    # reading from csv files
    import csv
    with open("test.csv", mode="r") as f:
        reader = csv.reader(f, delimiter=",")
        for row in reader:
            print(row)

Go ahead and run the cell. You'll notice that it outputs each row as a list with two items inside. We opened the file in read mode as the variable f. Then we created a reader object through the CSV library to read the contents of the file for us. Finally, we looped over the reader variable and printed out each row of data.
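If you would rather get each row as a dictionary keyed by the header row, the csv library also provides DictReader. This goes slightly beyond the lesson, so treat it as an optional sketch; the temporary file location is an assumption for the example.

```python
import csv
import os
import tempfile

# hypothetical location in a temporary folder, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# write the same two rows as in the lesson
with open(path, mode="w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    writer.writerow(["Name", "City"])
    writer.writerow(["Craig Lou", "Taiwan"])

# DictReader uses the first row as keys for every following row
with open(path, mode="r", newline="") as f:
    reader = csv.DictReader(f, delimiter=",")
    rows = [dict(row) for row in reader]

print(rows)
print(rows[0]["City"])
```

This ties the chapter together: each row is now a dictionary, so values are accessed by key rather than by index.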

Note: Objects will be covered in a later week.

File Modes in Python

Table 6-2 shows a few more file modes that you can use in Python.

Table 6-2. File modes

Mode   Description
'r'    This is the default mode. It opens the file for reading only.
'w'    Opens the file for writing. If the file doesn't exist, it creates one.
'x'    Creates a new file. If the file exists, the operation fails.
'a'    Opens in append mode. If the file doesn't exist, it creates one.
'b'    Opens in binary mode.
'+'    Opens a file for reading and writing. Good for updating.

THURSDAY EXERCISES

1. User Input: Ask the user for their favorite number and save it to a text file.

2. Data Dumping: Using the dictionary of data that follows, save the information to a csv file with the keys as the headers and the values as the rows of data.

    >>> data = {
    >>>     'name': ['Dave', 'Dennis', 'Peter', 'Jess'],
    >>>     'language': ['Python', 'C', 'Java', 'Python']
    >>> }

Today we learned how to work with text files and CSV files. There are two ways of working with files, each with its own purpose, but in general the with statement is easier to work with.

Friday: Creating a User Database with CSV Files

For this week's project, we'll build a replica of a user database using CSV files. We'll be able to take in input and let users log in, log out, and register. To follow along with this lesson, continue from our previous notebook file “Week_06” and add a markdown cell at the bottom that says “Friday Project: Creating a User Database with CSV Files.”

Final Design

This week's project is all about logic. We need to understand how to set up the step-by-step process of logging users in and out. There are three main parts to this program: registering a user, logging a user in, and the main loop that runs the program. Knowing that the first two are tasks, we can create functions from them and call them when necessary within the main loop. Let's go ahead and lay out the logical process for this program:

1. Check to see if the user is logged in.
   a. If logged in, ask if they would like to log out/quit.
      i. Either quit or log the user out and restart.
   b. Else, ask if they would like to log in/register/quit.
      i. If log in, ask the user for e-mail/password.
         1. If correct, log the user in and restart.
         2. Else, display an error and restart.
      ii. If register, ask for e-mail/password/password2.
         1. If the passwords match, save the user and restart.
         2. Else, display an error and restart.
      iii. If quit, say thank you and exit the program.

This is the program flow for our main loop. Now that you know exactly how the program should run, I encourage you to try building it yourself before continuing. That way, you'll be able to reference my code, see where you may have gone wrong, and so on. The loop continues to run until the user quits, and it allows them to register or log in. Once logged in, you'll only be able to log out or quit. It's simple, but it gives some insight into how to handle menu systems.

Setting Up Necessary Imports

Let's begin by importing the files and functions needed to run the program:

    1| # import all necessary packages to be used
    2| import csv
    3| from IPython.display import clear_output

We're going to write all the code in a single cell, so there's no need to run the cell right now. We've imported the CSV library so that we can work with CSV files, as well as the clear_output function, which lets us clear the printed statements from the notebook cell.

Handling User Registration

Next, we'll design the function that registers a user. Let's check it out:

    5| # handle user registration and writing to csv
    6| def registerUser():
    7|     with open("users.csv", mode="a", newline="") as f:
    8|         writer = csv.writer(f, delimiter=",")

    10|         print("To register, please enter your info:")
    11|         email = input("E-mail: ")
    12|         password = input("Password: ")
    13|         password2 = input("Re-type password: ")

    15|         clear_output()

    17|         if password == password2:
    18|             writer.writerow([email, password])
    19|             print("You are now registered!")
    20|         else:
    21|             print("Something went wrong. Try again.")

We begin by defining the function and opening a CSV file called “users.csv”. This is the file where we'll store our data. We create a writer object with that file, which allows us to append data. After asking the user for their information, we check whether both passwords entered are the same; if they are, we add the user with the writer object we created, and if not, we let the user know that something went wrong. Feel free to call this function and try it out. You'll see the file get created after the first attempt.

Handling User Login

The second task we need to design is the ability to log a user in. Let's see how to do that:

    23| # ask for user info and return true to login or false if incorrect info
    24| def loginUser():
    25|     print("To login, please enter your info:")
    26|     email = input("E-mail: ")
    27|     password = input("Password: ")

    29|     clear_output()

    31|     with open("users.csv", mode="r") as f:
    32|         reader = csv.reader(f, delimiter=",")

    34|         for row in reader:
    35|             if row == [email, password]:
    36|                 print("You are now logged in!")
    37|                 return True

    39|     print("Something went wrong, try again.")
    40|     return False

In the login function, we ask the user to enter their information. We then open the file that stores user information in read-only mode. A reader object is created with the CSV library, and we loop over each row of data on line 34. Each row we read is a list with two items. The first item is always the e-mail, and the second is the password. On line 35, we check the row against a temporary list filled with the information the user entered. If the data matches, we log them in and return True; otherwise, we display an error and return False. Try calling this function after registering.

Note: The file is stored in the same directory as the notebook file.
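One thing to be aware of: opening a file in read mode fails with a FileNotFoundError if the file doesn't exist yet, which can happen if someone tries to log in before anyone has registered. The sketch below shows one possible way to guard against that. The function name check_login, its parameters, and the temporary file are all hypothetical; they are not part of the project code.

```python
import csv
import os
import tempfile

def check_login(email, password, path):
    """Return True if [email, password] appears as a row in the CSV file at path."""
    try:
        with open(path, mode="r", newline="") as f:
            for row in csv.reader(f, delimiter=","):
                if row == [email, password]:
                    return True
    except FileNotFoundError:
        # no users have registered yet, so nobody can log in
        return False
    return False

# hypothetical data in a temporary folder, used only to try the function out
folder = tempfile.mkdtemp()
users_path = os.path.join(folder, "users.csv")
with open(users_path, mode="a", newline="") as f:
    csv.writer(f, delimiter=",").writerow(["john@example.com", "secret"])

print(check_login("john@example.com", "secret", users_path))
print(check_login("john@example.com", "wrong", users_path))
print(check_login("john@example.com", "secret", os.path.join(folder, "missing.csv")))
```

Passing the e-mail and password in as parameters, instead of calling input() inside the function, also makes the logic easier to test on its own.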

Creating the Main Loop

This is where the magic happens. So far, we've created the two main functions of the program: registering users and logging them in. The main loop handles the menu system and what to display based on whether the user is logged in. Let's go ahead and complete the program:

    42| # variables for main loop
    43| active = True
    44| logged_in = False

    46| # main loop
    47| while active:
    48|     if logged_in:
    49|         print("1. Logout\n2. Quit")
    50|     else:
    51|         print("1. Login\n2. Register\n3. Quit")

    53|     choice = input("What would you like to do? ").lower()

    55|     clear_output()

    57|     if choice == "register" and not logged_in:
    58|         registerUser()
    59|     elif choice == "login" and not logged_in:
    60|         logged_in = loginUser()
    61|     elif choice == "quit":
    62|         active = False
    63|         print("Thanks for using our software!")
    64|     elif choice == "logout" and logged_in:
    65|         logged_in = False
    66|         print("You are now logged out.")
    67|     else:
    68|         print("Sorry, please try again!")

Go ahead and run the cell. Before the loop begins, we define a couple of variables for the program. These variables track whether the user is logged in and whether the program should keep running. We then enter the main loop and display the proper menu, depending on whether the user is logged in. Since the user is never logged in when the program starts, the second menu is shown. Next, we ask the user what they would like to do by using the input() function. The following section is where the logic of our menu system takes place. Depending on the user's choice, we perform a specific action. We've set it up so that the user can only log in or register if they are not already logged in. Likewise, they can only log out if they are logged in. If they choose to log in or register, we call the respective function. For logging in, remember that the function returns True or False, which we then assign to our logged_in variable. If the user decides to quit, we set our active variable to False and exit the program. Until then, the program keeps displaying the proper menu based on whether the user is logged in. If they enter anything other than the included options, we display our error message.

Today, we were able to understand the logic behind a user registration process using CSV files. We'll use similar concepts later in this book to store data.

Weekly Summary

Throughout this week, we learned about one of the more important data collections, dictionaries. They are essential when working with data because they allow us to assign key-value pairs and retrieve information at high speed. We also covered a few other data collections that serve a purpose in specific situations. After understanding collections, we were able to learn how to work with files. Writing to and reading from files gives us the ability to add extra features to our programs, as we saw in the Friday project when we created a user registration application. We'll be able to apply this knowledge to the programs we create later in this book.

Challenge Question Solution

If you didn't know what a palindrome was, hopefully you looked it up. It's a word that is spelled the same forward and backward, such as “racecar.” There are a few different ways to arrive at an answer to this question. The following is an example of a simple and clean solution to the problem:

    >>> def palindrome(word):
    >>>     return True if word == word[::-1] else False

Remember that we covered ternary operators in a previous chapter, which allow us to write one-line conditional statements. If you wrote out the full if/else statement and still got the same result, then you did a great job. Going forward, you should start trying to understand how to condense your code further so that it is properly optimized.
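As a further step in condensing code, note that the comparison word == word[::-1] already evaluates to True or False, so it can be returned directly. This is a small variation on the solution above, not a replacement for it.

```python
def palindrome(word):
    # the comparison itself is a boolean, so no ternary is needed
    return word == word[::-1]

print(palindrome("racecar"))
print(palindrome("python"))
```

Both versions behave the same; this one simply skips the redundant True if ... else False.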

Weekly Challenges

To test out your skills, try these challenges:

1. Changing Passwords: Add a function called “changePassword” to Friday's project that lets users change their password when logged in.

2. Favorite Food: Write a new program that asks users what their favorite food is. Save the answers to a CSV file called “favorite_food.csv”. After they answer, display a table of tallied results. Example table:

    Favorite Food?    # of Votes
    Turkey
    Salad

CHAPTER 7

Object-Oriented Programming

Many languages are known as object-oriented programming (OOP) languages. Python, JavaScript, Java, and C++ are just a few of the names that use OOP. Throughout this week, we'll begin to understand what OOP is, why it's useful, and how to implement it in a program. In Python (and most languages), we create objects through classes that we build. You can think of a class as a blueprint for how an object is created. Take a first-person shooter video game as an example. All the players, vehicles, and weapons are objects. There may be five people on each of two teams, but each of those people is created from the same blueprint. They all share the same kinds of characteristics, such as weight, height, hair color, and so on. Rather than writing the same lines of code for ten different people, you write a single blueprint and create each person from it. This condenses the code and makes programs easier to manage and maintain. At the end of the week, we'll build a full game of Blackjack together and see the power of Python classes.

Overview

• Understanding the basics of object-oriented programming
• What attributes (variables within a class) are and how to use them
• What methods (functions within a class) are and how to use them
• Understanding the basics of inheritance (parent or base classes)
• Creating Blackjack with classes

CHALLENGE QUESTION

What would be the result of the following code?

    >>> values = { 4: 4, 8: 8, "Q": 10, "ACE": 11 }
    >>> card = ("Q", "Hearts")
    >>> print("{}".format(values[ card[0] ]))

Monday: Creating and Instantiating a Class

All objects in Python are created from classes. The point of OOP is to reuse the same code while keeping the flexibility to create each object with its own features. Today, we'll learn the terminology and stages of OOP, as well as how to write our first class. To follow along with today's content, open up Jupyter Notebook from our “python_bootcamp” folder. Once it's open, create a new file and rename it to “Week_07.” Next, make the first cell a markdown cell with a header that says “Creating & Instantiating a Class.” We'll begin working underneath that cell.

What Is an Object?

Think about the things around you right now. In programming, all of these things would be referred to as objects. Even people would be referred to as objects. This is because all objects come from a specific blueprint. In Python, those blueprints are known as classes. Let's take a car, for example. All cars have similar features and can be created from one template. Each car will generally have wheels, a color, a make, a model, a year, a VIN number, and so on. What classes allow us to do is build a blueprint with all of these features inside it and create different cars from it. This reduces the code we have to write and lets us give any car we create personal characteristics specific to that object. Figure 7-1 illustrates the concept of creating multiple objects from the same class.

Figure 7-1. Creating three similar cars from the same class blueprint

OOP Stages

There are two stages when using classes. The first stage is the class definition. Like a function definition, this stage is where you write the blueprint to be used when it is called. The second stage is called instantiation. It is the process of creating an object from the class definition. After an object is instantiated, it is known as an instance. You may have multiple instances from a single class definition. Let's begin looking at how to define a class and create an instance.

Creating a Class

The first step in using classes is creating the class definition, or “blueprint.” To create a new class, the syntax is like that of a function, but you use the class keyword instead of def. Within the indented block of the class, we would write the blueprint for our class attributes and methods. Don't worry about those for now, though. For the moment, we'll simply use the pass keyword. Let's see an example:

    # creating your first class
    class Car():
        pass    # simply using as a placeholder until we add more code tomorrow

Go ahead and run the cell. Nothing will happen, but that's a good thing because it means it worked. All classes are created with this same structure, except that instead of writing pass, we fill the block with code that gives features to the object.

Note: In Python, data types are also classes at their base. Printing out the type of an integer results in “<class 'int'>”.
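You can check this yourself with the built-in type() function. The short sketch below is only an illustration of the note; the values used are arbitrary.

```python
# every value in Python is an instance of some class
print(type(5))
print(type("hello"))
print(type({"year": 2018}))

# calling the class name creates a new instance of that class
number = int("42")
print(number, type(number))
```

Each print statement shows the word “class” followed by the name of the data type.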

Creating an Instance

Now that we know how to create a class definition, we can begin to understand how to create an instance of an object. The syntax is like storing a data type in a variable name, except that after the name of the class we use parentheses. We'll go over what those parentheses are used for in tomorrow's lesson. Let's check it out:

    # instantiating an object from a class
    class Car():    # parentheses are optional here
        pass

    ford = Car()    # creates an instance of the Car class and stores it in the variable ford
    print(ford)

Go ahead and run the cell. You'll get an output like “<__main__.Car object at 0x0332DB>”. This describes the class the instance was created from, “Car”, and the location in memory where the instance is stored, “0x0332DB.” We've successfully created an instance of the Car object and stored it in our “ford” variable.

Creating Multiple Instances

Remember that you can create as many instances as you'd like from each class. Let's create two instances from our class:

    # instantiating multiple objects from the same class
    class Car():
        pass

    ford = Car()
    subaru = Car()    # creates another object from the Car class
    print(hash(ford))
    print(hash(subaru))    # hash outputs a number derived from the object's location in memory

Go ahead and run the cell. When we output the hashes of our variables, we get two different numbers. By default, these numbers are derived from the location of each object in memory. This means that even though the two variables were created from the same source, they are stored as separate entities within the program. This is the beauty of objects, as each instance can have its own characteristics.
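Another way to confirm that two instances are separate objects is the is operator together with the built-in id() function. This is a complementary sketch, not part of the lesson's cell; the variable same_ford is made up for the example.

```python
class Car():
    pass

ford = Car()
subaru = Car()

# two separate instances are never the same object
print(ford is subaru)
print(id(ford) == id(subaru))

# a second variable name pointing at the same instance IS the same object
same_ford = ford
print(same_ford is ford)
```

The first two lines print False and the last prints True, because same_ford is just another name for the instance stored in ford.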

MONDAY EXERCISES

1. Animals: Create a class called “Animals,” and create two instances from it. Use two variables named “lion” and “tiger.”

2. Problem-Solving: What's wrong with the following code?

    >>> pass
    >>> school_bus = Bus()

Today was the first step into the world of object-oriented programming. To build objects in Python, we first have to create class definitions, also known as blueprints. From there, we can create one or more instances from that class. This process is known as instantiation. Tomorrow we'll see how we can give features to each instance.

Tuesday: Attributes

Yesterday we saw how to create a class definition. Today, we'll begin to understand how to give personalized features, known as attributes, to classes and their instances. Attributes are just variables defined within a class, nothing more. If you hear someone talking about attributes, they are almost certainly talking about classes. Attributes are how we store personal information for each object. Think of an attribute as a piece of information about an object. For a car, an attribute could be the color, number of wheels, number of seats, engine size, etc. To follow along with this lesson, let's continue from our previous notebook file "Week_07" and simply add a markdown cell at the bottom that says "Attributes."

Declaring and Accessing Attributes

Like variables, we declare attributes with a name and a value. We talked about scope in a previous week.

    # how to define a class attribute
    class Car( ):
        sound = "beep"    # all car objects will have this sound attribute and its value
        color = "red"     # all car objects will have this color attribute and its value
    ford = Car( )
    print(ford.color)    # known as 'dot syntax'

Go ahead and run the cell. The output will be "red". When we instantiate the ford variable from the Car class, it is created with two attributes. These attributes are set automatically within the class definition, so every instance created from the Car class will be given the sound "beep" and the color "red." We'll see how we can change this later. To access an object's attribute, you use dot syntax. You begin by writing the name of the instance, followed by a dot and the attribute you'd like to access. All classes use this same dot syntax to access attributes and methods (more on methods tomorrow).

Changing an Instance's Attributes

Not all objects you create will have the same characteristics, so you need to be able to change an attribute's value. To do this, you'll use dot syntax again.

    # changing the value of an attribute
    class Car( ):
        sound = "beep"
        color = "red"
    ford = Car( )
    print(ford.sound)    # will output 'beep'
    ford.sound = "honk"    # from now on the value of ford's sound is honk; this does not affect other instances
    print(ford.sound)    # will output 'honk'

Go ahead and run the cell. You'll notice that we output the sound attribute of the ford instance before and after we change it. Using dot syntax, we're able to assign a new value to the sound attribute. This is no different from changing a variable's value. The ford object's sound attribute will now be "honk" until we decide to change it.
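To see that changing an attribute on one instance leaves the class and the other instances alone, here is a short sketch using the same Car class. The built-in vars( ) shows which attributes are stored on each instance itself.

```python
class Car:
    sound = "beep"

ford = Car()
subaru = Car()
ford.sound = "honk"    # stores a new value on the ford instance only

print(ford.sound)      # honk
print(subaru.sound)    # beep
print(Car.sound)       # beep
print(vars(ford))      # {'sound': 'honk'}
print(vars(subaru))    # {}
```

The empty dictionary for subaru shows that it still reads sound from the class, while ford now carries its own value.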

Using the __init__( ) Method

So far, we've been creating classes in a very basic form. When you want to instantiate an object with specific attributes, you need to use the initialization (init) method. Whenever an instance is created, the init method is called immediately. You can use this method to instantiate objects with different attribute values upon creation. This lets us easily create class instances with personalized attributes. We'll go over methods tomorrow, so don't worry too much about the syntax for now; focus on understanding how this method is used. The declaration for this method has two underscores before and after the word init. It also includes "self" (more on this in the next section) inside the parentheses as a mandatory parameter. For this example, we'll create an instance with its color defined upon instantiation. Let's go ahead and try it out.

    1.  # using the init method to give instances personalized attributes upon creation
    3.  class Car( ):
    4.      def __init__(self, color):
    5.          self.color = color    # sets the attribute color to the value passed in
    7.  ford = Car("blue")    # instantiating a Car class with the color blue
    9.  print(ford.color)

Go ahead and run the cell. We'll get an output of "blue". When we create the ford instance, it is initialized with the attribute color set to blue. This all starts on line 7: when we instantiate ford, the argument "blue" is passed straight into the init method. Python fills in the self argument automatically, so "blue" is passed into the color parameter. Line 5, inside the init method, is where we set the color attribute to the argument that was just passed in, hence the value "blue". Keep in mind that the parameters for this method work the same way as they do for functions and need to be given in the correct order.

The "self" Keyword

The self keyword is a reference to the current instance of the class and is used to access the variables and methods associated with that instance. (Strictly speaking, self is the conventional name for a method's first parameter rather than a reserved word.) Think of a sports team that you've never watched before. How do you tell each player apart from the next? Even though each player is a person with different features, you're still able to easily pick any of them out based on their number. In Python, that is essentially how objects created from the same source are identified. In the previous cell, we printed the attribute color from the ford instance. The reason Python knew where to access this value, specifically for ford, is that we used the self keyword. We didn't need it for the basic classes because those attributes were globally accessible, which is covered later today. For now, just know that when you want to instantiate an object with personalized attributes, you need to declare the init method and use the self keyword to save each attribute value.
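One way to picture what self does is to call a method through the class and hand it the instance explicitly. The sketch below assumes a small helper method named describe, which is not part of the lesson and is only there for illustration.

```python
class Car:
    def __init__(self, color):
        self.color = color

    def describe(self):    # hypothetical helper used only for this illustration
        return f"a {self.color} car"

ford = Car("blue")
subaru = Car("red")

# Calling through the instance: Python passes ford in as self for us.
print(ford.describe())        # a blue car

# Calling through the class: we pass the instance in as self ourselves.
print(Car.describe(ford))     # a blue car
print(Car.describe(subaru))   # a red car
```

Both forms run the same code; self is simply whichever instance the method is working on at that moment.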

Instantiating Multiple Objects with __init__( )

To truly understand how the init method works, let's instantiate a couple of objects with two attributes that have different values.

    # defining different values for multiple instances
    class Car( ):
        def __init__(self, color, year):
            self.color = color    # sets the attribute color to the value passed in
            self.year = year
    ford = Car("blue", 2016)    # create a car object with the color blue and year 2016
    subaru = Car("red", 2018)    # create a car object with the color red and year 2018
    print(ford.color, ford.year)
    print(subaru.color, subaru.year)

Go ahead and run the cell. The two print statements at the bottom will output the attributes of each instance. When we instantiated the ford and subaru objects, we gave them different values for each of their respective attributes. This is the beauty of OOP. We were able to build two different objects from the same source with just two lines. Even if the class itself were thousands of lines long, creating ten different instances would only take ten lines of code.
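When there are many instances to create, the same class can be combined with a list comprehension. This is a sketch under the assumption that the car details are available as a list of tuples; the specs data is made up for the example.

```python
class Car:
    def __init__(self, color, year):
        self.color = color
        self.year = year

# Sample data for illustration only: one (color, year) pair per car.
specs = [("blue", 2016), ("red", 2018), ("green", 2020)]

# Each pair is unpacked and passed to __init__ in the order color, year.
cars = [Car(color, year) for color, year in specs]

for car in cars:
    print(car.color, car.year)
```

Storing instances in a list is one option for keeping track of several objects without inventing a new variable name for each.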

Global Attributes vs. Instance Attributes

Without knowing it, you've been using both globally accessible attributes and instance-accessible attributes. Global attributes (more commonly called class attributes in Python) can be referenced directly through the class and through all of its instances, while instance attributes (which are defined within the init method) can only be accessed through instances of the class. If an attribute is declared inside a class but not within the init method, it is known as a global attribute. Any attribute declared within the init method using the self keyword is an instance attribute. Let's see an example.

    1.  # using and accessing global class attributes
    3.  class Car( ):
    4.      sound = "beep"    # global attribute, accessible through the class itself
    6.      def __init__(self, color):
    7.          self.color = color    # instance-specific attribute, not accessible through the class itself
    9.  print(Car.sound)
    11. # print(Car.color) won't work, as color is only available to instances of the Car class, not the class itself
    13. ford = Car("blue")
    15. print(ford.sound, ford.color)    # color will work as this is an instance

Go ahead and run the cell. On line 9, we output the sound "beep" by accessing it directly through the class blueprint with dot syntax. You do this by using the name of the class rather than the name of an instance. We can do this because the sound attribute is set up as a globally accessible attribute. All of line 11 is commented out because it would produce an error: the color attribute is declared within the init method, so it is only accessible to instances and not to the class itself. Lastly, on line 15, after instantiating the ford object, we output both the sound and color attributes. All instances of a class have access to both global and instance-level attributes, which is why we're able to output sound. What you must keep in mind, however, is that we can't give the ford instance a personalized value for the sound attribute at the moment it is created. Only when attributes are declared within the init method can we give instances personalized values upon instantiation. As it stands, to give ford a different value for the sound attribute, we would have to change it after instantiation.
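The difference between the two kinds of attributes can also be checked without triggering an error by using the built-in hasattr( ). The cell below is a minimal sketch with the same Car class; it also shows the AttributeError that the commented-out line would raise.

```python
class Car:
    sound = "beep"               # declared on the class itself

    def __init__(self, color):
        self.color = color       # declared on each instance

ford = Car("blue")

print(hasattr(Car, "sound"))     # True: available through the class
print(hasattr(Car, "color"))     # False: exists only on instances

try:
    Car.color                    # accessing it anyway raises an error
except AttributeError as error:
    print("AttributeError:", error)

print(ford.sound, ford.color)    # beep blue
```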

TUESDAY EXERCISES

1. Dogs: Create a Dog class that has one global attribute and two instance-level attributes. The global attribute should be "species" with a value of "Canine." The two instance attributes should be "name" and "breed." Then instantiate two dog objects, a Husky named Sammi and a Chocolate Lab named Casey.
2. User Input: Create a Person class that has a single instance-level attribute of "name." Ask the user to input their name, and create an instance of the Person class with the name they entered. Then print out their name.

Today we learned all about attributes and how we can give classes personalized variables. Using the initialization method and the self keyword lets us declare attributes at the time of instantiation. Lastly, the difference between global and instance-level attributes is key. Attributes declared in the init method cannot be accessed directly through the class, only through instances of the class.

Wednesday: Methods

When you think about objects, you associate certain features and actions with them. Take a car, for example. It has attributes like color and wheels, but it also has actions, such as stopping, accelerating, turning, etc. In classes, these actions are known as methods. Methods are essentially functions that live inside classes. If you hear someone talking about methods, they are almost certainly talking about OOP. Today, we'll see how we can declare methods for our classes, how to call them, and why they're useful. To follow along with this lesson, let's continue from our notebook file "Week_07" and simply add a markdown cell at the bottom that says "Methods."

Defining and Calling a Method

Defining a method is the same as defining a function, except that the definition sits inside the class. When declaring a method that you intend to access through instances, you must use the self parameter in the definition. Without the self parameter, the method can only be accessed through the class itself. To call a method, you use dot syntax. As methods are just functions, you must call them with parentheses after the method name.

    # defining and calling our first class method
    class Dog( ):
        def makeSound(self):
            print("bark")
    sam = Dog( )
    sam.makeSound( )

Go ahead and run the cell. We'll get "bark" as our output. When we created the class definition, it included the makeSound method in the blueprint. Once we created an instance of the Dog class, we were able to access the method by calling it with dot syntax. You can have as many methods as you like within a class.

Accessing Class Attributes in Methods

Within the methods you create, you'll often need to access attributes defined in the class. To do so, you need to use the self keyword. Remember that self refers to the instance that is accessing the class. When we create multiple instances, self is what allows the program to understand which sound attribute to return. This is true even for global attributes. Let's see an example.

    # using the self keyword to access attributes within class methods
    class Dog( ):
        sound = "bark"
        def makeSound(self):
            print(self.sound)    # self required to access attributes defined in the class
    sam = Dog( )
    sam.makeSound( )

Go ahead and run the cell. We'll get an output of "bark" again, except this time it's because we accessed the sound attribute declared within the class. Whenever you need to reference an attribute with self, you must include self in the method's parameters.

Method Scope

Like global attributes, you can have methods that are accessible through the class itself rather than through an instance of the class. These are sometimes referred to as static methods, although Python's own static methods are declared with the @staticmethod decorator and can be called through instances as well. A method written without self and without that decorator, as in this lesson, is not accessible through instances of the class. Depending on the class you're building, it may be helpful to have a method that is only accessible through the class and not through its instances. Let's see an example.

    1.  # understanding which methods are accessible via the class itself and class instances
    3.  class Dog( ):
    4.      sound = "bark"
    6.      def makeSound(self):
    7.          print(self.sound)
    9.      def printInfo( ):
    10.         print("I am a dog.")
    12. Dog.printInfo( )    # able to run printInfo method because it does not include self parameter
    14. # Dog.makeSound( ) would produce error, self is in reference to instances only
    16. sam = Dog( )
    18. sam.makeSound( )    # able to access, self can reference the instance of sam
    20. # sam.printInfo( ) will produce error, instances require the self parameter to access methods

Go ahead and run the cell. This time we've defined two methods within our Dog class. One method has self in its parameters, while the other does not. The method without the self parameter can be accessed through the class itself, which is why line 12 outputs "I am a dog." Line 14 is commented out because makeSound can only be accessed through instances of our Dog class, not the class itself. Lastly, line 20 is also commented out because methods that aren't defined with the self parameter can't be accessed through instances of the class; otherwise, we would produce an error. This is the importance of the self keyword.
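For comparison, here is a sketch of the same idea using Python's built-in @staticmethod decorator, which the lesson itself does not use. The extra method plainInfo is a hypothetical name that mimics the lesson's undecorated printInfo, and the methods return strings instead of printing so the results are easy to inspect.

```python
class Dog:
    sound = "bark"

    def makeSound(self):      # instance method: needs an instance for self
        return self.sound

    @staticmethod
    def printInfo():          # static method: receives no self at all
        return "I am a dog."

    def plainInfo():          # hypothetical: no self and no decorator, as in the lesson
        return "I am a dog."

sam = Dog()
print(Dog.printInfo())    # I am a dog.
print(sam.printInfo())    # I am a dog.  (decorated methods work through instances too)
print(Dog.plainInfo())    # I am a dog.

try:
    sam.plainInfo()       # Python passes sam as an unexpected first argument
except TypeError as error:
    print("TypeError:", error)

try:
    Dog.makeSound()       # no instance is supplied for self
except TypeError as error:
    print("TypeError:", error)
```

The decorator is worth considering whenever a method does not need self but should still be callable from anywhere the class or one of its instances is available.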

Passing Arguments into Methods

Methods work the same way as functions, in that you can pass arguments into the method to be used. When these arguments are passed in, they do not need to be referenced with the self parameter, as they are not attributes but temporary variables that the method can use.

    # writing methods that accept parameters
    class Dog( ):
        def showAge(self, age):
            print(age)    # does not need self, age is referencing the parameter not an attribute
    sam = Dog( )
    sam.showAge(6)    # passing the integer 6 as an argument to the showAge method

Go ahead and run the cell. We'll get an output of 6. After defining an instance of Dog, we called the showAge method and passed the integer 6 into it as an argument. The method was then able to print out age. We did not need to write "self.age" because self refers to the attributes of the instance, not to parameters.
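The contrast between attributes and parameters is easier to see when a method uses both. In this sketch, name is an attribute reached through self, while age and unit are ordinary parameters; the method name describeAge and the default value for unit are assumptions made for the example.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def describeAge(self, age, unit="years"):    # hypothetical method for illustration
        # self.name is an attribute; age and unit are temporary parameters
        return f"{self.name} is {age} {unit} old"

sam = Dog("Sammi")
print(sam.describeAge(6))                  # Sammi is 6 years old
print(sam.describeAge(8, unit="months"))   # Sammi is 8 months old
```

After the call finishes, age and unit are gone, while name stays stored on the instance.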

Using Setters and Getters

In programming there is a concept called setters and getters. They are methods that you create to re-declare attribute values and to return attribute values. We've seen how we can change attribute values by accessing them directly. Good practice is to create a method that changes the attribute value for you, and to call that method whenever you need to set a new value. The same goes for accessing an attribute's value. This gives us a safer way to access an instance's attributes. Let's see how:

    1.  # using methods to set or return attribute values, proper programming practice
    3.  class Dog( ):
    4.      name = ''    # would normally use init method to declare, this is for testing purposes
    6.      def setName(self, new_name):
    7.          self.name = new_name    # declares the new value for the name attribute
    9.      def getName(self):
    10.         return self.name    # returns the value of the name attribute
    11. sam = Dog( )
    13. sam.setName("Sammi")
    15. print(sam.getName( ))    # prints the returned value of self.name

Go ahead and run the cell. We've created two methods, one setter and one getter. These methods will generally have the keywords "set" and "get," respectively, at the beginning of the method name. On line 6, we define a setter that takes in a parameter of new_name and changes the attribute name to the value passed in. This is a better practice for altering attribute values. On line 9, we create a getter method that simply returns the value of the name attribute. This is a better practice for retrieving an attribute's value. Lines 13 and 15 call both methods in order to alter the value and print out the returned value.
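Python also offers the built-in property decorator, a different technique from the setName and getName methods above that keeps plain dot syntax while still routing access through methods. The sketch below is one possible arrangement; the underscore-prefixed _name attribute and the rule that names must not be empty are assumptions made for the example.

```python
class Dog:
    def __init__(self, name=""):
        self._name = name            # underscore marks the value as internal by convention

    @property
    def name(self):                  # runs when sam.name is read
        return self._name

    @name.setter
    def name(self, new_name):        # runs when sam.name is assigned
        if not new_name:
            raise ValueError("name must not be empty")
        self._name = new_name

sam = Dog()
sam.name = "Sammi"       # goes through the setter
print(sam.name)          # Sammi

try:
    sam.name = ""        # rejected by the check inside the setter
except ValueError as error:
    print("ValueError:", error)
```

This is where the "safer" part of setters shows up: the method can refuse a bad value before it ever reaches the attribute.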

Incrementing Attributes with Methods

Like setters, when you want to change an attribute's value by incrementing or decrementing it rather than replacing it outright, the best way to go about it is to create a method to complete the task.

    # incrementing/decrementing attribute values with methods, best programming practice
    class Dog( ):
        age = 5
        def happyBirthday(self):
            self.age += 1
    sam = Dog( )
    sam.happyBirthday( )    # calls method to increment value by one
    print(sam.age)    # better practice to use getters, this is for testing purposes

Go ahead and run the cell. In this example, we created a method called happyBirthday that increments the dog's age by one each time it is called. This is simply better practice, not a requirement, for changing an attribute's value.
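It is worth checking what happens to other dogs when one of them has a birthday. In the sketch below, casey is a second instance added for illustration; because self.age += 1 stores the new value on the instance, only sam gets older.

```python
class Dog:
    age = 5                   # shared starting value declared on the class

    def happyBirthday(self):
        self.age += 1         # reads the current age, then stores the result on this instance

sam = Dog()
casey = Dog()
sam.happyBirthday()

print(sam.age)     # 6
print(casey.age)   # 5
print(Dog.age)     # 5
```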

Methods Calling Methods

When calling a method from another method, you need to use the self parameter. Let's create a getter method and a method that prints out information about the dog based on the returned value.

    # calling a class method from another method
    class Dog( ):
        age = 6
        def getAge(self):
            return self.age
        def printInfo(self):
            if self.getAge( ) < 10:    # need self to call other method for an instance
                print("Puppy.")
    sam = Dog( )
    sam.printInfo( )

Go ahead and run the cell. We'll get an output of "Puppy." here. We're able to get the returned value from our getter because of how we reference the getAge method within our printInfo method: with the self keyword and dot syntax. The condition proved true, as the returned value was 6, so the print statement within the block ran.

Magic Methods

While they have a funny name, magic methods are the foundation of classes in Python. Without knowing it, you've already used one: the initialization method. All magic methods have two underscores before and after their name. When you print out anything, you're accessing a magic method called __str__. When you use operators (+, -, /, *, ==, etc.), you're accessing magic methods. They are essentially functions that decide what operators and other tasks in Python perform. Don't get too caught up on them, as we won't be using them much, but I wanted to introduce you to them. As mentioned, the __str__ magic method is called when using the print function. Let's change what gets printed out when we print a class that we've defined ourselves.

    # using magic methods
    class Dog( ):
        def __str__(self):
            return "This is a dog class"
    sam = Dog( )
    print(sam)    # will print the return of the string magic method

Go ahead and run the cell. Previously, when we printed out an instance, it would output the name of the class blueprint and the memory location. Now that we've altered the __str__ magic method, we're able to output a completely different print statement. Keep in mind that the __str__ magic method expects a string to be returned, not printed. All magic methods require certain parameters and return values. Feel free to look up a few more and alter others to see how they work.
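As a sketch of how operators map to magic methods, the class below defines __eq__ for == and __lt__ for < alongside __str__. The name and age attributes and the comparison rules are choices made for this example, not something the lesson prescribes.

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):                  # used by print() and str()
        return f"{self.name}, age {self.age}"

    def __eq__(self, other):            # used by the == operator
        if not isinstance(other, Dog):
            return NotImplemented
        return (self.name, self.age) == (other.name, other.age)

    def __lt__(self, other):            # used by the < operator
        return self.age < other.age

sam = Dog("Sammi", 6)
casey = Dog("Casey", 3)

print(sam)                       # Sammi, age 6
print(sam == Dog("Sammi", 6))    # True
print(casey < sam)               # True
```

Without __eq__, two separately created dogs with the same name and age would compare as different, because the default comparison looks at object identity.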

WEDNESDAY EXERCISES

1. Animals: Create a class definition of an animal that has a species attribute and both a setter and a getter to change or access the attribute's value. Create an instance called "lion," and call the setter method with an argument of "feline." Then print out the species by calling the getter method.
2. User Input: Create a class Person that takes in a name when instantiated but sets an age to 0. Within the class definition, set up a setter and a getter that will ask the user to input their age and set the age attribute to the value entered. Then output the information in a formatted string such as "You are 64 years old," assuming the user inputs 64 as their age.

Today, we learned about methods and how they work within classes. To access other methods from within a method, we need to use the self parameter. Methods give classes additional functionality and are used in almost every class we create. They give all instances of a given class the same capabilities.

Thursday: Inheritance

Sometimes you'll create classes that have similar attributes or methods. Take a Dog class and a Cat class, for example. Both would have nearly identical code, attributes, and methods. Rather than writing the same code twice, we use a concept called inheritance. To follow along with this lesson, let's continue from our previous notebook file "Week_07" and simply add a markdown cell at the bottom that says "Inheritance."

What Is Inheritance?

When you have two or more classes that use similar code, you generally want to set up what's called a "superclass," or parent class. The classes that inherit all the code within the superclass are called "subclasses." A great way to think about inheritance is parents and their children. Parents pass down genes to their children, who inherit them, and those genes help determine the features the child is born with. Inheritance works the same way, in that the subclass inherits all the attributes and methods of the superclass. Rather than writing the same attributes and methods twice for two classes, we can inherit from a class and write the code only once.

Inheriting a Class

To inherit a class, we need to put the name of the class we're inheriting from between the parentheses after the name of our subclass. Let's try it out:

    1.  # inheriting a class and accessing the inherited method
    3.  class Animal( ):
    4.      def makeSound(self):
    5.          print("roar")
    7.  class Dog(Animal):    # inheriting Animal class
    8.      species = "Canine"
    10. sam = Dog( )
    12. sam.makeSound( )    # accessible through inheritance
    14. lion = Animal( )
    16. # lion.species not accessible, inheritance does not work backwards

Go ahead and run the cell. On line 7, we inherit the Animal class into our Dog class. This gives Dog the ability to access the makeSound method, which is why on line 12 we're able to use dot syntax to access makeSound. Keep in mind, however, that inheritance does not work backwards, so Animal does not have access to the attributes and methods defined within the Dog class. For this reason, line 16 is commented out: the species attribute does not exist in Animal, and trying to access it would produce an error.
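The one-way nature of inheritance can be checked with the built-ins issubclass( ), isinstance( ), and hasattr( ). This is a minimal sketch with the same two classes; makeSound returns its string here instead of printing so the result can be inspected.

```python
class Animal:
    def makeSound(self):
        return "roar"

class Dog(Animal):       # Dog inherits everything defined in Animal
    species = "Canine"

sam = Dog()
lion = Animal()

print(issubclass(Dog, Animal))    # True
print(isinstance(sam, Animal))    # True: a Dog is also an Animal
print(isinstance(lion, Dog))      # False: an Animal is not a Dog
print(hasattr(lion, "species"))   # False: inheritance does not work backwards
print(sam.makeSound())            # roar
```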

Using the super( ) Method

The super method is used to create forward compatibility when using inheritance. When a superclass requires attributes to be declared, super is used to initialize their values. The syntax is the built-in name super, parentheses, a dot, the init method, and any attributes within the parentheses of the init call. Let's see an example:

    1.  # using the super( ) method to declare inherited attributes
    3.  class Animal( ):
    4.      def __init__(self, species):
    5.          self.species = species
    7.  class Dog(Animal):
    8.      def __init__(self, species, name):
    9.          self.name = name
    10.         super( ).__init__(species)    # using super to declare the species attribute defined in Animal
    12. sam = Dog("Canine", "Sammi")
    14. print(sam.species)

Go ahead and run the cell. On line 9, we declare the name attribute with the argument passed in, as this attribute is defined only in the Dog class. Line 10 is where the super method is called to initialize the species attribute, because that attribute is declared inside the Animal superclass. Using super here helps reduce lines of code, which becomes more apparent when the superclass requires several attributes. Once the super method is called, our species attribute is set to the argument passed in, and it is now accessible through our Dog object, which is why we're able to output the species on line 14.
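The savings become clearer when the superclass needs several attributes. In this sketch, the extra attributes legs and sound and the fixed values passed by Dog are assumptions made for the example.

```python
class Animal:
    def __init__(self, species, legs, sound):
        self.species = species
        self.legs = legs
        self.sound = sound

class Dog(Animal):
    def __init__(self, name):
        # One call sets all three inherited attributes instead of three assignments.
        super().__init__("Canine", 4, "bark")
        self.name = name

sam = Dog("Sammi")
print(sam.name, sam.species, sam.legs, sam.sound)    # Sammi Canine 4 bark
```

If Animal later gains another attribute, only the super( ) call in each subclass needs to change, rather than a block of repeated assignments.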

Method Overriding

Sometimes when using inheritance, you'll want the subclass to perform a different action when the same method is called. Take the makeSound method from the Animal class we created earlier. It prints out "roar," but that's not the sound you want a dog to make when you create your Dog class. Instead, we use the concept of method overriding to change what the method does. Within the subclass, we redefine the method (with the same name) to perform a different task. Python always uses the method defined within the subclass first, and if one doesn't exist, it checks the superclass. Let's use method overriding to alter the makeSound method and print the proper statement for our Dog class:

    1.  # overriding methods defined in the superclass
    3.  class Animal( ):
    4.      def makeSound(self):
    5.          print("roar")
    7.  class Dog(Animal):
    8.      def makeSound(self):
    9.          print("bark")
    11. sam, lion = Dog( ), Animal( )    # declaring multiple variables on a single line
    13. sam.makeSound( )    # overriding will call the makeSound method in Dog
    15. lion.makeSound( )    # no overriding occurs as Animal does not inherit anything

Go ahead and run the cell. On line 11, we declare two instances, sam and lion. Line 13 is where we call the makeSound method from our dog instance, sam. The output is "bark" because of method overriding: the method was inherited but then redefined in the Dog class, so the redefined version runs instead. On line 15, we call the same method with our Animal instance, lion. This output is "roar" because lion is an instance of the Animal class. Remember that inheritance doesn't work backwards. Subclasses cannot give features to the superclass.
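An overriding method does not have to discard the inherited behavior completely; it can call the superclass version through super( ) and build on it. The sketch below assumes the methods return strings rather than print, and the Cat class is added only to show a subclass with no override.

```python
class Animal:
    def makeSound(self):
        return "roar"

class Dog(Animal):
    def makeSound(self):
        inherited = super().makeSound()    # the Animal version is still reachable
        return f"bark (the inherited sound was {inherited})"

class Cat(Animal):
    pass    # no override, so the Animal version is used as is

print(Dog().makeSound())    # bark (the inherited sound was roar)
print(Cat().makeSound())    # roar
```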

Inheriting Multiple Classes

So far, we've seen how we can inherit from a single superclass. Now we're going to try inheriting from multiple classes. The main difference is how you super the attributes. With several parents, super follows Python's method resolution order, which can be harder to reason about, so here you call the class name directly and pass the self parameter along with the attributes. Let's see how:

    1.  # how to inherit multiple classes
    3.  class Physics( ):
    4.      gravity = 9.8
    6.  class Automobile( ):
    7.      def __init__(self, make, model, year):
    8.          self.make, self.model, self.year = make, model, year    # declaring all attributes on one line
    10. class Ford(Physics, Automobile):    # able to access Physics and Automobile attributes and methods
    11.     def __init__(self, model, year):
    12.         Automobile.__init__(self, "Ford", model, year)    # calling the parent class directly by name
    14. truck = Ford("F-150", 2018)
    16. print(truck.gravity, truck.make)    # output both attributes

Go ahead and run the cell. We'll get an output of 9.8 and "Ford". On line 10, you'll notice that we inherit two classes within the parentheses for the Ford class. Line 12, however, is where the magic happens this time. Instead of using super, we initialize the attributes by calling the name of the inherited class directly. Using the init method, we pass the self parameter along with all the attributes that Automobile requires. Python knows which superclass to use because of the name at the beginning of the line. On the last line, we can see that we have access to the attributes declared in both Physics and Automobile, the two classes we inherit from.
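To see the order in which Python searches the parents, you can inspect the class's __mro__ attribute. The sketch below also shows that, for this particular pair of parents, super( ) happens to reach Automobile's init method because Physics does not define one; that outcome depends on this exact class layout and should not be taken as a general rule.

```python
class Physics:
    gravity = 9.8

class Automobile:
    def __init__(self, make, model, year):
        self.make, self.model, self.year = make, model, year

class Ford(Physics, Automobile):
    def __init__(self, model, year):
        # Physics has no __init__, so the search moves on to Automobile.
        super().__init__("Ford", model, year)

truck = Ford("F-150", 2018)
print([cls.__name__ for cls in Ford.__mro__])    # ['Ford', 'Physics', 'Automobile', 'object']
print(truck.gravity, truck.make)                 # 9.8 Ford
```

Calling Automobile.__init__ directly, as in the lesson, makes the intended parent explicit and keeps working even if Physics later gains an init method of its own.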

THURSDAY EXERCISES

1. Good Guys/Bad Guys: Create three classes, starting with a superclass called "Characters" that is defined with the following attributes and methods:
    a. Attributes: name, team, height, weight
    b. Methods: sayHello
The sayHello method should output the statement "Hello, my name is Max and I'm on the good guys". The team attribute should be declared as a string of either "good" or "bad." The other two classes, which will be subclasses, will be "GoodPlayers" and "BadPlayers." Both classes will inherit "Characters" and super all the attributes that the superclass requires. The subclasses do not need any other methods or attributes. Instantiate one player on each team and call the sayHello method for each. The output should result in the following:

    >>> "Hello, my name is Max and I'm on the good guys"
    >>> "Hello, my name is Tony and I'm on the bad guys"

Today was all about inheritance in OOP. Using inheritance, we can cut down on the repetitive lines we write between similar classes. Classes that are inherited from are called superclasses, while those that do the inheriting are called subclasses. In addition, the ability to redefine an inherited method is known as method overriding and lets subclasses customize their behavior.

Friday: Creating Blackjack

Throughout this week, we've learned all about using classes in Python to improve our programs. Today, we're going to put all of that knowledge together and build the popular game of Blackjack. We'll use classes throughout the program, and you'll be able to see how we can structure a full-fledged object-oriented game in Python. It's assumed that you know how to play Blackjack. If not, look up the rules and the steps on how to play. To follow along with this lesson, let's continue from our previous notebook file "Week_07" and add a markdown cell at the bottom that says "Friday Project: Creating Blackjack."

Thiết kế cuối cùng Như với tất cả các dự án Thứ sáu trước đó, chúng tôi cần tạo ra một thiết kế cuối cùng mà chúng tôi có thể làm theo. Tuần này hơi khác một chút, vì chúng tôi cũng cần thiết kế các lớp học của mình trước. Điều này sẽ giúp chúng ta tìm ra những thuộc tính và phương thức mà các lớp của chúng ta cần phải có trước khi chúng ta bắt đầu lập trình. Bám sát kế hoạch chi tiết này sẽ cải thiện quá trình lập trình. Đầu tiên, hãy nghĩ xem chúng ta cần những lớp nào. Trong Blackjack, bạn có các quy tắc trò chơi cụ thể, hành động của trò chơi và bản thân bộ bài. Sau đó, chúng ta cũng cần xem xét rằng có một người chơi và một người chia bài đang chơi trò chơi. Có vẻ như chúng ta cần tạo hai lớp, một cho chính trò chơi và một cho hai người chơi. Bạn có thể lập luận rằng bạn cần có một hạng riêng cho người chia bài và người chơi; . Trước tiên hãy nghĩ xem lớp Trò chơi cần gì. •

• Game Attributes
    deck: holds all 52 cards that will be used in the game
    suits: used to create the deck, a tuple of all four suits
    values: used to create the deck, a tuple of all card values

• Game Methods
    makeDeck: creates a new 52-card deck when called
    pullCard: pops a random card from the deck and returns it

The Game class mainly keeps track of the deck that we're playing with. We could certainly put all game-related methods inside this class as well, and if you want to restructure the game afterwards, feel free to do so. Methods like checkWinner, checkBust, handleTurn, and so on could all be part of the Game class. For this lesson, we won't worry about adding these methods to Game. Knowing what the Game class will handle helps us understand what our Player class needs. Let's go ahead and plan the attributes and methods for that class now.

• Player Attributes
    hand: stores the cards in the player's hand
    name: string variable that stores the name of the player or dealer

• Player Methods
    calcHand: returns the calculated point total of the hand
    showHand: prints out the player's hand in a nicely formatted statement
    addCard: takes in a card and adds it to the player's hand

As we can see, the Player class keeps track of each player's hand and any methods associated with altering that hand. Generally, you always want to put the methods that change an attribute in the same class that stores the attribute. Now that we have a good idea of the attributes and methods needed for each class, we'll follow this guide while programming the game.

Setting Up Imports

Let's begin writing this program by importing the functions that we'll be using.

1. # import necessary functions
2. from random import randint  # allows us to get a random number
3. from IPython.display import clear_output

Be sure to check out the randint function. It takes two arguments, a minimum and a maximum, and returns a random integer between them, with both endpoints included. The other import we need gives us the ability to clear the output of the notebook cell.
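A quick way to convince yourself that randint includes both of its arguments is to draw many values and look at the range that comes back. This is only an illustrative check, not part of the project code; the variable name rolls is made up for the example.

```python
from random import randint

# draw many values and keep them so we can inspect the range
rolls = []
for _ in range(10000):
    rolls.append(randint(1, 6))

# every value lies inside the inclusive range 1..6
print(min(rolls) >= 1, max(rolls) <= 6)

# with equal endpoints there is only one possible result
print(randint(3, 3))  # always 3, because both endpoints are included
```

This inclusiveness is the reason the deck code later subtracts 1 from the length of the deck before passing it to randint.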

Creating the Game Class

Next, we'll begin writing the main game class, which we'll call Blackjack. Looking at the design we created earlier, we need to initialize the class with the deck, suits, and values attributes.

5. # create the blackjack class, which will hold all game methods and attributes
6. class Blackjack():
7.     def __init__(self):
8.         self.deck = []  # set to an empty list
9.         self.suits = ("Spades", "Hearts", "Diamonds", "Clubs")
10.         self.values = (2, 3, 4, 5, 6, 7, 8, 9, 10, "J", "Q", "K", "A")

We set the deck attribute to an empty list because we'll create a method that builds the deck for us. The other two attributes are created as tuples so that we can iterate over them without changing the items. We'll use them to create the cards for our deck.

Generating the Deck

Using the suits and values defined in the Blackjack class, we'll generate our deck.

12.     # create a method that creates a deck of 52 cards; each card should be a tuple with a value and suit
13.     def makeDeck(self):
14.         for suit in self.suits:
15.             for value in self.values:
16.                 self.deck.append((value, suit))  # ex: (7, "Hearts")
18. game = Blackjack()
19. game.makeDeck()
20. print(game.deck)  # remove this line after it prints out correctly

Go ahead and run the cell. Our makeDeck method has created a complete deck of 52 tuples, each with a value at index 0 and a suit at index 1. We store each card as a tuple because we don't want to change its values by accident. In the last three lines, we create an instance of the game, call the makeDeck method, and output the value of the deck attribute. Be sure to remove the last line when you're done, as the print statement is only used for debugging purposes.
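To check the claim that the nested loops produce exactly 52 distinct cards, the same loops can be run outside the class. The following is a standalone sketch for verification only; it repeats the suits and values tuples so that it can run on its own.

```python
suits = ("Spades", "Hearts", "Diamonds", "Clubs")
values = (2, 3, 4, 5, 6, 7, 8, 9, 10, "J", "Q", "K", "A")

# same nested loops as makeDeck, written outside a class
deck = []
for suit in suits:
    for value in values:
        deck.append((value, suit))

print(len(deck))       # 4 suits * 13 values = 52
print(len(set(deck)))  # 52 again, so no card is duplicated
print(deck[0])         # (2, 'Spades')

# tuples are immutable, so a card cannot be changed by accident
try:
    deck[0][0] = 3
    card_is_locked = False
except TypeError:
    card_is_locked = True
print(card_is_locked)  # True
```

The last check shows the practical benefit of storing each card as a tuple: an accidental assignment raises a TypeError instead of silently altering the deck.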

Pulling a Card from the Deck

Now that we have generated the deck, we can create a method to pull a card from it. We'll use the pop method so that we can take an item and remove it from the deck at the same time.

16.                 self.deck.append((value, suit))  # ex: (7, "Hearts")
◽◽◽
18.     # method to pop a card from the deck using a random index value
19.     def pullCard(self):
20.         return self.deck.pop(randint(0, len(self.deck) - 1))
22. game = Blackjack()
23. game.makeDeck()
25. print(game.pullCard(), len(game.deck))  # remove this line after it prints out correctly

Go ahead and run the cell. You should get an output like "(7, 'Hearts') 51". The tuple is the card that we printed out, while the 51 shows us that a card really is being removed from the deck. We set up the pullCard method so that it pops a random card from the deck. The choice is random because of the arguments that we pass into randint. The maximum number that we allow is always one less than the size of the deck, because indexing starts at zero. If the deck has 45 cards left, we want the random integer to be between 0 and 44. The method then pops the item at that random index, which removes it from the deck, and returns it to wherever the method was called. For now we simply print it out, but later we'll add it to a player's hand. Be sure to remove the last line when you're done, as the print statement is only used for debugging purposes.
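The index arithmetic described above can be tried on a small deck. This is a minimal sketch in which pull_card is a hypothetical standalone function that mirrors the pullCard method; the nine-card test deck is made up for the example.

```python
from random import randint


def pull_card(deck):
    """Remove and return one card at a random valid index (mirrors pullCard)."""
    index = randint(0, len(deck) - 1)  # randint includes both endpoints
    return deck.pop(index)


# small test deck: the 2 through 10 of Hearts
deck = []
for value in range(2, 11):
    deck.append((value, "Hearts"))

card = pull_card(deck)
print(card, len(deck))  # some card, followed by 8
print(card in deck)     # False: the card was removed, not copied
```

One caveat worth knowing: with an empty deck the call becomes randint(0, -1), which raises a ValueError, so a longer-running game would need to rebuild the deck before it runs out.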

Creating a Player Class

With the game class working properly, we turn our attention to the player class. Let's begin by creating the class definition so that it accepts a name and sets the hand to an empty list.

20.         return self.deck.pop(randint(0, len(self.deck) - 1))
◽◽◽
22. # create a class for the dealer and player objects
23. class Player():
24.     def __init__(self, name):
25.         self.name = name
26.         self.hand = []
28. game = Blackjack()
29. game.makeDeck()
31. name = input("What is your name?")
32. player = Player(name)
33. dealer = Player("Dealer")
34. print(player.name, dealer.name)  # remove after working correctly

Go ahead and run the cell. We get a print statement with the name that was entered, as well as "Dealer". We define the Player class to be initialized with a name and a hand attribute. The name attribute is taken in as an argument, while hand is set directly inside the class. After we instantiate the game object, we ask the user for their name and create an instance of the Player class from their input. The dealer object will always be named "Dealer", which is why we create that instance with the value passed in directly during instantiation.

Adding Cards to the Player's Hand

Once the player objects are instantiated properly, we can begin working on the methods needed for the Player class. When deciding which method to program first, always think about which methods depend on others. For this class, the calcHand and showHand methods depend on having cards in the hand. For this reason, we'll work on the addCard method first and then focus on the other two.

26.         self.hand = []
◽◽◽
28.     # take in a tuple and append it to the hand
29.     def addCard(self, card):
30.         self.hand.append(card)
32. game = Blackjack()
◽◽◽
37. dealer = Player("Dealer")
◽◽◽
39. # add two cards to the dealer and player hand
40. for i in range(2):
41.     player.addCard(game.pullCard())
42.     dealer.addCard(game.pullCard())
44. print("Player Hand: {} \nDealer Hand: {}".format(player.hand, dealer.hand))  # remove after

Go ahead and run the cell. The output shows two random cards in each player's hand. The addCard method simply takes in a tuple that represents a card and appends it to the player's hand. On line 40, we begin a for loop that adds two cards to each hand. It does so by pulling a card with the game instance's pullCard method. That method returns a tuple, which is then passed into the addCard method and appended to the respective player's hand. This loop is enough to begin a game in which every player starts with two cards in hand. Be sure to remove the last line, as it is only used for debugging.

Showing a Player's Hand

In the previous section, we printed out the entire hand of each player. In real Blackjack, however, only one of the dealer's cards (the second one dealt) is shown at the start. It's also poor practice to reference the attribute directly, so we'll create the showHand method to take care of both of these problems. We'll use nicely formatted print statements to display the hands but, more importantly, we'll make sure that while it is still the player's turn, you can only see one of the dealer's cards.

30.         self.hand.append(card)
◽◽◽
32.     # if not dealer's turn, then only show one of his cards, otherwise show all
33.     def showHand(self, dealer_start=True):
34.         print("\n{}".format(self.name))
35.         print("===========")
37.         for i in range(len(self.hand)):
38.             if self.name == "Dealer" and i == 0 and dealer_start:
39.                 print("- of -")  # hide first card
40.             else:
41.                 card = self.hand[i]
42.                 print("{} of {}".format(card[0], card[1]))
44. game = Blackjack()
◽◽◽
54.     dealer.addCard(game.pullCard())
◽◽◽
56. # show both hands using method
57. player.showHand()
58. dealer.showHand()

Go ahead and run the cell. The output for the player's hand shows both cards, while the dealer's shows only one. Let's walk through this step by step. On line 33, we declare the showHand method with the parameter dealer_start. This parameter is a boolean value that tracks whether we should hide the first card dealt to the dealer. We set its default value to True so that the only time we need to pass False into the method is at the end, when we want to reveal the dealer's cards. The for loop on line 37 lets us print out each card in the player object's hand. Line 38 is where we check two things (together with i == 0, which singles out the first card):

1. The instance calling this method is the dealer.
2. It is not yet the dealer's turn (dealer_start == True).

If both are true, then we hide the first card; otherwise we print the card. The card variable is declared for ease of reading, as we set it to one of the items in the hand, which represents a card. We then print a formatted statement with the values of the tuple. This is done by accessing index 0 and index 1 of the tuple that represents each card. At the bottom of the cell, we call this method for each player object.

Calculating the Hand Total

Now that we can call a method to display each player's hand properly, we need to calculate the total of the cards in the hand. This method gets a little tricky, though, because there are a few checks we need to keep in mind:

1. Aces can be worth 11 points or 1 point. They are worth 1 point if counting them as 11 would put the total over 21.
2. If the dealer is only showing one card, the value of his hand should only represent the value of that one card, even though he has two cards in his hand.
3. All face cards (J, Q, K) are worth 10 points.

There are several ways to handle this method. What we're going to program together is just one of many. When thinking about how to count the aces, we need to decide their value after we have totaled all the other cards. We'll keep track of how many aces we have first and then add them to the total afterwards. To ensure that we return the proper total for the dealer, we'll track whether it is his turn, just as we did in the showHand method. Lastly, to calculate the face card values, we'll create a dictionary of values to pull from.

42.                 print("{} of {}".format(card[0], card[1]))
◽◽◽
43.         print("Total = {}".format(self.calcHand(dealer_start)))
45.     # if not dealer's turn, then only return the total of the second card
46.     def calcHand(self, dealer_start=True):
47.         total = 0
48.         aces = 0  # calculate aces afterwards
49.         card_values = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, "J": 10, "Q": 10, "K": 10, "A": 11}
51.         if self.name == "Dealer" and dealer_start:
52.             card = self.hand[1]
53.             return card_values[card[0]]
55.         for card in self.hand:
56.             if card[0] == "A":
57.                 aces += 1
58.             else:
59.                 total += card_values[card[0]]
61.         for i in range(aces):
62.             if total + 11 > 21:
63.                 total += 1
64.             else:
65.                 total += 11
67.         return total
69. game = Blackjack()
◽◽◽

Go ahead and run the cell. Starting on line 46, we declare the calcHand method with dealer_start as a parameter. We give this parameter a default value of True, so that by default it only reports the total of one card for the dealer. Line 47 is where we declare the variable that keeps track of the total. Line 48 is where we declare the variable that tracks how many aces are in the hand. On line 49, we declare a dictionary of key-value pairs that represents the values of the cards. The conditional statement on line 51 checks whether the dealer object is the one calling this method, as well as whether the dealer_start parameter is True. If both are true, we return only the value of the second card in the dealer's hand. It is the second card because we set the card variable equal to the item at index 1 of the hand. We then look up the card_values dictionary with the item at index 0 of the card variable. That item is one of the keys, and the dictionary returns the value of that key-value pair. If the item at index 0 is "J", the dictionary returns a value of 10.

The for loop that starts on line 55 loops over each card in the respective player's hand, references the dictionary for the card's value, and adds that value to the current total. If the card is an ace, it simply adds one to our aces variable and adds nothing to the total. The next for loop, on line 61, runs as many times as there are aces in the player's hand. For each ace, we add either 1 point or 11 points depending on the total. If adding 11 points to the hand would push the total above 21, we simply add one point instead. At the end of the method, we return the total.

Lastly, line 43 is where we call calcHand from within the showHand method. We pass along the dealer_start variable in case we're showing the hand during the dealer's turn. Later, during the dealer's turn, we'll pass the argument False, which totals all of the dealer's cards instead of just one.
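The text notes that there are several ways to handle aces. One case worth testing is a hand such as 10, A, A: deciding each ace one at a time counts the first ace as 11 (reaching 21) and then pushes the hand to 22, although 10 + 1 + 1 = 12 is a legal total. The sketch below shows one alternative that is not the book's method: count every ace as 11 first and then downgrade aces to 1 while the hand is over 21. The function name hand_total and its list-of-values input are assumptions made for this example.

```python
def hand_total(values):
    """Total a list of card values such as [10, "A", "A"] (hypothetical helper)."""
    special_values = {"J": 10, "Q": 10, "K": 10, "A": 11}
    total = 0
    aces = 0
    for value in values:
        # number cards are worth their own value; face cards and aces are looked up
        total += special_values.get(value, value)
        if value == "A":
            aces += 1
    # downgrade aces from 11 to 1, one at a time, while the hand is over 21
    while total > 21 and aces > 0:
        total -= 10
        aces -= 1
    return total


print(hand_total(["A", "K"]))      # 21
print(hand_total([10, "A", "A"]))  # 12
print(hand_total([9, "A", "A"]))   # 21
print(hand_total([5, "K", 9]))     # 24, a bust with no aces to downgrade
```

Whether to adopt this inside calcHand is a design choice; the dealer_start handling on lines 51 to 53 would stay exactly as it is.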

Handling the Player's Turn

The class definitions are now complete, so we can begin to focus on the main game flow. First, we'll handle the player's turn. The player should be able to either hit or stay. If they stay, their turn is over. If they hit, we need to pull a card from the deck and add it to their hand. After the card is added, we have to check whether the player went over 21. If they did, they lose, and we need to keep track of that to determine the output later.

83. dealer.showHand()
◽◽◽
85. player_bust = False  # variable to keep track of player going over 21
87. while input("Would you like to stay or hit?").lower() != "stay":
88.     clear_output()
90.     # pull card and put into player's hand
91.     player.addCard(game.pullCard())
93.     # show both hands using method
94.     player.showHand()
95.     dealer.showHand()
97.     # check if over 21
98.     if player.calcHand() > 21:
99.         player_bust = True  # player busted, keep track for later
100.         print("You lose!")  # remove after running correctly
101.         break  # exit out of the player's loop

Go ahead and run the cell. For now, try hitting until you go over 21. This results in an output of "You lose!". Nothing happens if you stay without going over 21, because we haven't handled that case yet, but we'll get there. On line 85, we declare a variable to track whether the player has gone over 21. We then begin our while loop by asking the user whether they would like to hit or stay. If they type anything other than "stay", the loop runs. Within the loop, we clear the output, add a card to the player's hand, show both hands, and then check whether the player busted. There are two ways for the loop to end: the player busts, or the player chooses to stay.

Handling the Dealer's Turn

The dealer's turn is very similar to the player's turn, except that we don't need to ask whether the dealer would like to hit. The dealer automatically hits while his total is below 17. We do still need to track whether the dealer busts.

100.         break  # exit out of the player's loop
◽◽◽
102. # handling the dealer's turn, only run if player didn't bust
103. dealer_bust = False
105. if not player_bust:
106.     while dealer.calcHand(False) < 17:  # pass False to calculate all cards
107.         # pull card and put into the dealer's hand
108.         dealer.addCard(game.pullCard())
110.         # check if over 21
111.         if dealer.calcHand(False) > 21:  # pass False to calculate all cards
112.             dealer_bust = True
113.             print("You win!")  # remove after running correctly
114.             break  # exit out of the dealer's loop

Go ahead and run the cell. Try running the cell until the dealer goes over 21, which causes the print statement to run. We start by declaring a variable on line 103 to track whether the dealer busts. On line 105, we check whether the player has already busted; if so, the round is over and the dealer does not need to draw. Line 106 is where our loop begins, which adds a card to the dealer's hand and checks whether he busted. The loop continues until the dealer has more than 16 points or goes over 21.

When we call the calcHand method for the dealer this time, we pass the argument False. This makes the method calculate the total of the whole hand rather than just the second card, as it did before.
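The dealer's rule (hit while below 17, then check for a bust) can be tried without a deck or any user input. The following is a deliberately simplified sketch: dealer_turn is a hypothetical function, it works on plain point totals rather than Player objects, and it ignores aces entirely.

```python
def dealer_turn(total, upcoming_values):
    """Hit while below 17; return the final total and whether the dealer busted.

    `upcoming_values` stands in for the cards the deck would deal next.
    """
    draws = iter(upcoming_values)
    while total < 17:
        total += next(draws)
    return total, total > 21


print(dealer_turn(12, [4, 10]))  # (26, True): 12 -> 16 -> 26, a bust
print(dealer_turn(15, [2, 9]))   # (17, False): stops as soon as 17 is reached
print(dealer_turn(18, []))       # (18, False): never draws
```

Because a total above 21 is also not below 17, the while condition alone already ends the loop after a bust; the explicit break in the project code makes that exit easier to read and gives a place to set dealer_bust.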

Calculating a Winner

The final section of this game is working out who won. So far, we have done a few checks to see whether either player lost by going over 21. First we'll check whether the player busted, and then the dealer. If neither busted, we need to see who has the higher total. If they tie, it is called a push, and no one wins.

113.             break  # exit out of the dealer's loop
◽◽◽
115. clear_output()
117. # show both hands using method
118. player.showHand()
119. dealer.showHand(False)  # pass False to calculate and show all cards, even when there are 2
121. # calculate a winner
122. if player_bust:
123.     print("You busted, better luck next time!")
124. elif dealer_bust:
125.     print("The dealer busted, you win!")
126. elif dealer.calcHand(False) > player.calcHand():
127.     print("Dealer has higher cards, you lose!")
128. elif dealer.calcHand(False) < player.calcHand():
129.     print("You beat the dealer! Congrats!")
130. else:
131.     print("You pushed, no one wins!")

Go ahead and run the cell. We now have a fully functional Blackjack game. To start, we clear the output and show both players' hands. The main difference, however, is on line 119. We pass the argument False into the showHand method for the dealer. This is so that all of the dealer's cards are shown, along with the full total. Remember that we call the calcHand method within showHand and pass along the value of dealer_start, which we set to False with this method call. We then set up several conditionals that print the proper result based on the given condition.
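The order of the checks above matters: the player's bust is tested first, so a busted player loses even though the dealer never draws. One way to make that order easy to test is to express it as a pure function of the two totals. This is only a sketch; decide_winner and its return strings are hypothetical and not part of the project code.

```python
def decide_winner(player_total, dealer_total):
    """Return "player", "dealer", or "push" using the same order of checks."""
    if player_total > 21:  # player bust is checked first
        return "dealer"
    if dealer_total > 21:  # then dealer bust
        return "player"
    if dealer_total > player_total:
        return "dealer"
    if dealer_total < player_total:
        return "player"
    return "push"  # equal totals: no one wins


print(decide_winner(22, 10))  # dealer
print(decide_winner(18, 25))  # player
print(decide_winner(19, 20))  # dealer
print(decide_winner(21, 20))  # player
print(decide_winner(20, 20))  # push
```

Returning a value instead of printing keeps the decision separate from the output, which also makes it simpler to reuse if you later add betting.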

Final Output

Congratulations on finishing this project! Due to the size of the project, you can find the completed version of the code on Github. To find the specific code for this project, simply open or download the "Week_07.ipynb" file. If you ran into errors along the way, be sure to cross-reference your code with the code in this file and see where you may have gone wrong.

Although today's project was long, we were able to see some great examples of object-oriented programming. Using classes gave us the ability to reuse the same lines of code, as we did for the player and dealer objects. This program could certainly be restructured to have more methods within the Blackjack class; for this project, though, I kept the classes shorter and the main game functionality separate. Be sure to test the game out and add your own features to it if you'd like.

Weekly Summary

Throughout this week, we covered the concepts of object-oriented programming and why they are important in the programming world. In Python, we know them as classes. Classes allow us to reuse code and create multiple instances from one definition. When we store variables or create functions within a class, they are known as attributes and methods. We reference these using dot syntax and the self parameter. Without classes, we would need to hard-code each line for every object in our program, which becomes especially apparent in larger-scale programs. To improve code reuse further, we can use inheritance. This allows subclasses to inherit attributes and methods from superclasses, much like children and their parents. At the end of this week, we were able to create an object-oriented Blackjack game. It showed the capabilities of OOP, as we were able to create multiple instances of the Player class. Going forward, remember to view the world around you as objects. It will help you become accustomed to the world of OOP and understand what the attributes and methods of an object are.

Challenge Question Solution

The solution to the challenge question is 10. The reason behind this result is the way dictionaries work. Remember that when accessing information from a dictionary, you work with key-value pairs. When you access a key in a dictionary, you receive the value of that key-value pair. The following line accesses the first item of the card variable:

>>> card[0]

This results in "Q", as it is the first item in the tuple assigned to card. When we access the dictionary, we are therefore accessing the value of the key "Q". The final line would look like this:

>>> print("{}".format(values["Q"]))

This outputs the value of the "Q": 10 key-value pair, which is 10.
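The two lookups described above can be run together. The challenge question's exact code is not repeated in this section, so the contents of values and card below are assumptions chosen to match the explanation (a dictionary in which "Q" maps to 10, and a tuple whose first item is "Q").

```python
# assumed stand-ins for the variables in the challenge question
values = {"J": 10, "Q": 10, "K": 10}
card = ("Q", "Hearts")

key = card[0]                       # first item of the tuple
print(key)                          # Q
print("{}".format(values[key]))     # 10
print("{}".format(values[card[0]])) # 10, same lookup written in one step
```

The tuple index picks the key, and the dictionary then turns that key into its value.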

Weekly Challenges

To test your skills, try these challenges:

1. Game Loop: Using the code from our Friday project, create a game loop so that you can continually play a new hand until the player decides to quit. The cell should only stop running if the player types "quit".

2. Adding Currency: Using the code from our Friday project, add the ability to bet currency within the game. Be sure to track the currency within the Player class, as the attribute should belong to that object. Before each hand, ask the user how much they would like to bet.

CHAPTER 8

Advanced Topics I: Efficiency

Now that we have a solid foundation to work with, we can begin to dive into more advanced topics. Over the next two weeks, we'll cover concepts that help reduce the amount of code you need to write. Many of these concepts will prepare us for the data analysis work in Week 10. Throughout this week, we'll cover one-liners using list comprehension and anonymous functions. These help reduce lines of code by condensing the same functionality into a single line. We'll then cover a few built-in Python functions that make working with data easier. The last concept we cover is functions that call themselves, known as recursive functions. These types of functions are often inefficient, so we'll also cover how to use a caching concept known as memoization. As this week is all about advanced topics, we'll finish by diving into one of the more important algorithms in programming: Binary Search. We'll see how to program this algorithm line by line and understand how search algorithms can work efficiently.

Overview

• Building lists in one line using comprehension
• Understanding one-line anonymous functions
• Using Python's built-in functions to alter lists
• Understanding recursive functions and how to improve them
• Writing the algorithm for Binary Search

CHALLENGE QUESTION

For this week's challenge, I'd like you to create a program that asks a user to input a number and tells that user whether the number they entered is a prime number or not. Remember that a prime number is divisible only by one and itself and must be greater than 1 (2 is the smallest prime). Create a function called "isPrime" that you pass the input into and that returns a True or False value. Be sure to keep efficiency in mind when programming the function.

Monday: List Comprehension

List comprehension allows us to create a list filled with data in a single line. Instead of creating an empty list, iterating over some data, and appending it all to the list on separate lines, we can use comprehension to perform all of these steps at once. Its main benefit is not performance; rather, it is cleaner and helps reduce the lines of code within our program. With comprehension, we can reduce two or more lines into one. It's also generally quicker to write.

To follow along with today's content, open Jupyter Notebook from our "python_bootcamp" folder. Once it's open, create a new file and rename it to "Week_08." Next, make the first cell a markdown cell with a header that says "List Comprehension." We'll begin working underneath that cell.

List Comprehension Syntax

The syntax when using list comprehension depends on what you're trying to write. The general syntax structure for list comprehension looks like the following:

>>> *result* = [ *transform* *iteration* *filter* ]

For example, when you want to populate a list, the syntax would be structured like the following:

>>> name_of_list = [ item_to_append for item in list ]

However, when you want to include an if statement, the comprehension would look like the following:

>>> name_of_list = [ item_to_append for item in list if condition ]

The item is only appended to the new list if the condition is met; otherwise it is skipped. Lastly, if you want to include an else condition, it would look like the following:

>>> name_of_list = [ item_to_append if condition else item_to_append for item in list ]

When using an else condition within list comprehension, the first item is only appended to the list when the if statement proves True. If it is False, then the item that comes after the else statement is appended to the list instead.
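The three syntax forms can be compared side by side on one small list. This is an illustrative sketch; the words list and the variable names are made up for the example.

```python
words = ["apple", "kiwi", "banana", "fig"]

# 1. plain form: transform every item
upper = [word.upper() for word in words]

# 2. filter form: the trailing if decides whether an item is kept at all
short = [word for word in words if len(word) <= 4]

# 3. if/else form: every item is kept, the condition picks which value goes in
labels = ["short" if len(word) <= 4 else "long" for word in words]

print(upper)   # ['APPLE', 'KIWI', 'BANANA', 'FIG']
print(short)   # ['kiwi', 'fig']
print(labels)  # ['long', 'short', 'long', 'short']
```

Note where the condition sits: a filter goes after the for clause and can shorten the list, while an if/else goes before the for clause and always produces one result per item.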

Generating a List of Numbers

Let's try generating a list of numbers from 0 up to (but not including) 100 using list comprehension.

# create a list of 100 numbers using list comprehension
nums = [ x for x in range(100) ]  # generates a list from 0 through 99
print(nums)

Go ahead and run the cell. You'll notice that we output a list of 100 numbers. List comprehension allowed us to build this list in one line rather than writing out the for loop and the append statement on separate lines. The comprehension from the previous cell is an exact representation of the following code:

>>> nums = [ ]
>>> for x in range(100):
>>>     nums.append(x)

As you can see, we reduced three lines to one by using comprehension. This isn't about improving performance; it reduces the number of lines in our code. The benefit becomes more apparent in larger programs, and I highly recommend using comprehension where you can. Going forward, we'll begin to use list comprehension when building lists.
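The claim that the comprehension and the three-line loop build the same list is easy to verify directly. The name loop_nums below is introduced only for this comparison.

```python
# comprehension version
nums = [x for x in range(100)]

# equivalent loop version
loop_nums = []
for x in range(100):
    loop_nums.append(x)

print(nums == loop_nums)             # True: both lists hold the same items
print(nums[0], nums[-1], len(nums))  # 0 99 100
```

The second line of output is a reminder that range(100) stops before 100, so the list runs from 0 through 99.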

If Statements

Earlier, we went over how the syntax changes when you include an if statement in your comprehension. Let's try an example by creating a list of only even numbers.

# using if statements within list comprehension
nums = [ x for x in range(10) if x % 2 == 0 ]  # generates a list of even numbers up to 10
print(nums)

Go ahead and run the cell. For this comprehension, the variable x is only appended to the list when the condition proves True. In our case, the condition is True when the current value of x is divisible by two. In the following, you'll find the same code written without comprehension:

>>> nums = [ ]
>>> for x in range(10):
>>>     if x % 2 == 0:
>>>         nums.append(x)

This time we were able to reduce four lines of code to one. This can often improve the readability of your code.

If-Else Statements

Now let's take it one step further and add in an else statement. This time we'll append the string "Even" when the number is divisible by two; otherwise, we'll append the string "Odd".

# using if/else statements within list comprehension
nums = [ "Even" if x % 2 == 0 else "Odd" for x in range(10) ]  # generates a list of even/odd strings
print(nums)

Go ahead and run the cell. This outputs a list of strings that represent the odd or even numbers. Here we append the string "Even" when the if condition is True; otherwise, the else branch appends the string "Odd". The same code written without comprehension can be found in the following:

>>> nums = [ ]
>>> for x in range(10):
>>>     if x % 2 == 0:
>>>         nums.append("Even")
>>>     else:
>>>         nums.append("Odd")

We've reduced the number of lines of code from six to one. Comprehensions are great for quickly generating data. Keep in mind that there is no elif statement inside a comprehension, only if/else.
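Although the elif keyword is not available inside a comprehension, a three-way choice can still be written by nesting one conditional expression inside the else of another. The sketch below shows that form next to a small helper function, which is often the more readable option; the name describe is hypothetical.

```python
numbers = [-2, 0, 3]

# chained conditional expressions: reads as "if ... else (if ... else ...)"
chained = ["negative" if x < 0 else "zero" if x == 0 else "positive" for x in numbers]


def describe(x):
    """Same three-way choice written with a regular if/elif/else."""
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"


with_helper = [describe(x) for x in numbers]

print(chained)                 # ['negative', 'zero', 'positive']
print(chained == with_helper)  # True
```

Once a comprehension needs more than one condition like this, moving the logic into a named function usually keeps the one-liner readable.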

List Comprehension with Variables

Comprehensions are also great for generating data from other lists. Let's take a list of numbers and generate a separate list of those numbers squared, using comprehension.

# creating a list of squared numbers from another list of numbers using list comprehension
nums = [2, 4, 6, 8]
squared_nums = [ num**2 for num in nums ]  # creates a new list of squared numbers based on nums
print(squared_nums)

Go ahead and run the cell. We get an output of [ 4, 16, 36, 64 ]. For this example, we were able to generate the squared numbers by appending the expression "num**2". The same code written without comprehension would look like the following:

>>> squared_nums = [ ]
>>> for num in nums:
>>>     squared_nums.append(num**2)

In this example, we were able to reduce the lines needed from three to one.

Dictionary Comprehension

Not only can you use comprehension on lists but also on Python dictionaries. The syntax structure is exactly the same, except that you need to include a key-value pair instead of a single item to insert into the dictionary. Let's create a dictionary of even numbers as the keys, where the value is the key squared.

# creating a dictionary of even numbers and square values using comprehension
numbers = [ x for x in range(10) ]
squares = { num : num**2 for num in numbers if num % 2 == 0 }
print(squares)

Go ahead and run the cell. We get the following output: "{0: 0, 2: 4, 4: 16, 6: 36, 8: 64}". We were able to add each key-value pair using comprehension, while checking whether each number is even with a conditional statement.
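Dictionary comprehension also works with non-numeric keys. As a sketch that ties back to last week's project, the card-value lookup used for Blackjack could be assembled from two small comprehensions. This is an alternative construction shown for illustration, not how the project defines card_values; the names face_values and number_values are made up.

```python
# every face card maps to 10
face_values = {face: 10 for face in ("J", "Q", "K")}

# number cards 2 through 10 map to themselves
number_values = {n: n for n in range(2, 11)}

# merge both dictionaries and add the ace
card_values = {**number_values, **face_values, "A": 11}

print(len(card_values))  # 13, one entry per card value
print(card_values["Q"])  # 10
print(card_values[7])    # 7
```

Writing the dictionary out by hand, as the project does, is just as valid; comprehension mainly pays off when the pattern behind the pairs is regular.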

MONDAY EXERCISES

1. Degree Conversion: Using a list comprehension, convert the following list to Fahrenheit. Currently, the degrees are in Celsius. The conversion formula is "(9/5) * C + 32". Your output should be approximately [53.6, 69.8, 59.0, 89.6].

    >>> degrees = [12, 21, 15, 32]

2. User Input: Ask the user to input a single integer up to and including 100. Using a list comprehension, generate a list of the numbers up to and including 100 that are exactly divisible by that number. For example, if the number 25 was input, then the output should be [25, 50, 75, 100].

Today's focus was all about generating lists using a concept called list comprehension. Depending on the expression needed, you'll use a certain syntax structure. Comprehension doesn't improve performance; rather, it reduces the lines of code you need to write. It can also improve readability.


Tuesday: Lambda Functions

Lambda functions, otherwise known as anonymous functions, are one-line functions within Python. Like list comprehensions, lambda functions allow us to reduce the lines of code we need to write within our program. They don't work for complex functions but help to improve the readability of smaller functions. To follow along with this lesson, let's continue from our previous notebook file "Week_08" and simply add a markdown cell at the bottom that says "Lambda Functions."

Lambda Function Syntax

The syntax for lambda functions stays mostly the same, unlike list comprehensions, whose structure changes once you begin adding conditionals. To begin, let's look at the basic structure.

    >>> lambda arguments : expression

Lambdas always begin with the keyword lambda. Next come any arguments being passed in. To the right of the colon is the expression to be evaluated and returned. Lambdas return the expression by default, so we don't need to use the return keyword.

    >>> lambda arguments : value_to_return if condition else value_to_return

As with list comprehensions, the conditional statement goes at the end. This is as complicated as lambda functions get. Anything more than this would require writing out the function in full.
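The two syntax forms can be tried out directly. This is a minimal sketch; the names `add_one` and `parity` are hypothetical and chosen only for illustration.

```python
# Basic form: lambda arguments: expression
add_one = lambda x: x + 1

# Conditional form: lambda arguments: value_if_true if condition else value_if_false
parity = lambda x: "Even" if x % 2 == 0 else "Odd"

print(add_one(4))  # 5
print(parity(7))   # Odd
```

In both cases the value of the expression to the right of the colon is what the call returns, without a return keyword.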

Note: Lambdas essentially use ternary operators to the right of the colon.

Using a Lambda

When using a lambda without storing it in a variable, you need to wrap parentheses around the function, as well as around any arguments being passed in. Let's start small by writing a lambda function that returns the result of the argument squared.


    # using a lambda to square a number
    (lambda x: x**2)(4)  # takes in 4 and returns the number squared

Go ahead and run the cell. We'll get an output of 16. The first set of parentheses holds the lambda function. The second set holds the argument being passed in. In this case, the integer 4 is passed into x, the expression x**2 is evaluated, and the result is returned. These are known as anonymous functions because they don't have a name. In the following, you'll find the code for a normal function that performs the same operation.

    >>> def square(x):
    >>>     return x**2
    >>> square(4)

We've taken three lines and turned them into one. Once you get used to reading lambda syntax, programs become easier to read and write with these functions.

Passing Multiple Arguments

Lambdas can take in any number of arguments, just like functions. This time, let's try passing in two arguments and multiplying them together.

    # passing multiple arguments into a lambda
    (lambda x, y: x * y)(10, 5)  # x = 10, y = 5 and returns the result of 5 * 10

Go ahead and run the cell. We'll get an output of 50. This time, the lambda function accepts two arguments, x and y, on the left side of the colon. On the right side of the colon, the expression multiplies those two arguments together and the result is returned. In the following, you'll find the same code written as a normal function.

    >>> def multiply(x, y):
    >>>     return x * y
    >>> multiply(10, 5)

As before, we're able to save a couple of lines of code and get the same result.


Saving Lambda Functions

Lambdas get the name anonymous function because they don't have a name to reference or call upon. Once a lambda function is used, it can't be used again unless it's saved into a variable. Let's use the same lambda function as before, except this time let's save it into a variable called "square" that can be referenced even after the lambda function is read.

    # saving the lambda function into a variable
    square = lambda x, y: x * y
    print(square)
    result = square(10, 5)  # calls the lambda function stored in the square variable and returns 5 * 10
    print(result)

Go ahead and run the cell. The first print statement shows the function object itself. The second gives us the same output as before, except this time we got it by calling square as a function. When functions are stored inside of variables, the variable name acts as the function call. Because we stored a lambda inside the variable square, we can call the lambda function by calling square and passing in the arguments.

Note: Even normally defined functions can be saved into variables and referenced by the variable name.

Conditional Statements

Once you begin adding conditional statements to a lambda function, they work the same way that ternary operators do. The only difference is that you must provide both an if and an else statement. You can't use an if statement on its own. Let's create a lambda that returns the greater of the two arguments passed in.

    # using if/else statements within a lambda to return the greater number
    greater = lambda x, y: x if x > y else y
    result = greater(5, 10)
    print(result)


Go ahead and run the cell. We'll get an output of 10, as it's the higher value. Lambdas are extremely useful when you need a function that performs a simple conditional like this. The same code written as a normal function can be seen in the following.

    >>> def greater(x, y):
    >>>     if x > y:
    >>>         return x
    >>>     else:
    >>>         return y
    >>> result = greater(5, 10)

When using conditionals, it's easy to see the power of lambda functions. In this case, we were able to turn five lines of code into one.

Returning a Lambda

Where lambda functions shine is in their ability to make other functions more modular. Let's say we have a function that takes in an argument, and we want that argument to be multiplied by an unknown number later in the program. We can simply create a variable that stores the returned lambda function while passing in the argument. Let's try a couple of examples.

    # returning a lambda function from another function
    def my_func(n):
        return lambda x: x * n

    doubler = my_func(2)  # returns equivalent of lambda x: x * 2
    print(doubler(5))     # will output 10
    tripler = my_func(3)  # returns equivalent of lambda x: x * 3
    print(tripler(5))     # will output 15

Go ahead and run the cell. We'll get an output of 10 and 15. What happens when we define our doubler variable is that we call my_func while passing in the integer value 2. That value is used within the lambda function, and then the lambda is returned. However, the lambda isn't returned as "lambda x: x * n"; it behaves as "lambda x: x * 2", because n was 2 when it was created. Whenever doubler is called, it's actually the lambda function that gets called. That's why we get an output of 10 when passing the value 5 into doubler. The same applies to our tripler variable. We're able to change the result of my_func thanks to the returned lambda function.
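The behaviour described above is a closure: each returned lambda remembers the value of n from the call that created it. The sketch below uses the hypothetical names `make_multiplier`, `doubler`, and `tripler` to show that the two returned functions are independent.

```python
def make_multiplier(n):
    # The returned lambda keeps a reference to n from this particular call
    return lambda x: x * n

doubler = make_multiplier(2)
tripler = make_multiplier(3)

print(doubler(5), tripler(5))  # 10 15

# The two lambdas are separate function objects, each with its own n
print(doubler is tripler)  # False
```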


TUESDAY EXERCISES

1. Fill in the Blanks: Fill in the blanks for the following code so that it takes in a parameter "x" and returns "True" if it is greater than 50; otherwise, it should return "False".

    >>> ____ x _ True if x _ 50 ____ False

2. Degree Conversion: Write a lambda function that takes in a degree value in Celsius and returns the degree converted into Fahrenheit.

Today we were able to understand the differences between normal functions and anonymous functions, otherwise known as lambda functions. They're useful for readability and for condensing your code. One of their most powerful features is giving functions more capability by being returned from them.

Wednesday: Map, Filter, and Reduce

When working with data, you'll generally need to be able to modify it, filter it, or calculate a single result from it. That's where these important built-in functions come into play. The map function is used to iterate over a data collection and modify it. The filter function is used to iterate over a data collection and, you guessed it, filter out data that doesn't meet a condition. Lastly, the reduce function takes a data collection and condenses it down to a single result, like the sum function does for lists. To follow along with this lesson, let's continue from our notebook file "Week_08" and simply add a markdown cell at the bottom that says "Map, Reduce, and Filter."

Map Without Lambdas

The map function is used when you need to alter all the items within an iterable data collection. It takes in two arguments: the function to be applied to each element and the iterable data. When using map, it returns a map object, which is an iterator. Don't worry about what that is for now. Let's try taking a list of Celsius temperatures and converting them all to Fahrenheit.


    1. # using the map function without lambdas
    2. def convertDeg(C):
    3.     return (9/5) * C + 32
    4. temps = [12.5, 13.6, 15, 9.2]
    5. converted_temps = map(convertDeg, temps)  # returns map object
    6. print(converted_temps)
    7. converted_temps = list(converted_temps)  # type convert map object into list of converted temps
    8. print(converted_temps)

Go ahead and run the cell. The first print statement will output something like "<map object at 0x...>". This is because the map function returns a map object, not the converted data collection. On line 7, we're able to convert the map object into a list, which results in an output of approximately "[54.5, 56.48, 59.0, 48.56]". When the map object is consumed, it loops over the temps list that was passed in. As it iterates, it passes a single item into the convertDeg function until it has passed in all the items. The equivalent of this process is the following.

    >>> for item in temps:
    >>>     convertDeg(item)

Each converted value is produced by the map object in turn. It isn't until we convert the map object into a list that we're able to see the converted temperatures.
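Because a map object is an iterator, it produces values on demand and can be consumed only once. The following sketch illustrates that behaviour; the function name `convert_deg` and the sample temperatures are assumptions chosen so the arithmetic is easy to follow.

```python
def convert_deg(c):
    return (9 / 5) * c + 32

temps = [10, 20, 30]
converted = map(convert_deg, temps)  # nothing is computed yet

first = next(converted)   # computes only the first item
rest = list(converted)    # consumes whatever is left
again = list(converted)   # the iterator is now exhausted

print(first)  # 50.0
print(rest)   # [68.0, 86.0]
print(again)  # []
```

This is why the examples in the text convert the map object to a list before printing: the list holds on to the results after the iterator has been used up.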

Map with Lambdas

Now that we've seen how to use map with a normally defined function, let's try it with a lambda function this time. As map requires a function as the first parameter, we can simply write a lambda in place of the name of a defined function. We can also type convert it on the same line.

    # using a map function with lambdas
    temps = [12.5, 13.6, 15, 9.2]
    converted_temps = list(map(lambda C: (9/5) * C + 32, temps))  # type convert the map object right away
    print(converted_temps)


Go ahead and run the cell. We'll get the same output as we did before but in far fewer lines of code. This is the beauty of combining these two concepts. The lambda function takes in each item as the map function iterates over the temps list and returns the converted value. The same process that we're performing can be found in the following lines of code.

    >>> def convertDeg(degrees):
    >>>     converted = []
    >>>     for degree in degrees:
    >>>         result = (9/5) * degree + 32
    >>>         converted.append(result)
    >>>     return converted
    >>> temps = [12.5, 13.6, 15, 9.2]
    >>> converted_temps = convertDeg(temps)
    >>> print(converted_temps)

As you can see, using lambda functions and map helps to reduce the lines of code used when we need to alter our data.

Filter Without Lambdas

The filter function is useful for taking a collection of data and removing any information that you don't need. Like the map function, it takes in a function and an iterable data type and returns a filter object. This object can be converted into a working list like we did with our map object. Let's use the same data and filter out any temperatures that aren't above 55 degrees Fahrenheit.

    # using the filter function without lambda functions, filter out temps below 55F
    def filterTemps(C):
        converted = (9/5) * C + 32
        return True if converted > 55 else False  # use ternary operator

    temps = [12.5, 13.6, 15, 9.2]
    filtered_temps = filter(filterTemps, temps)  # returns filter object
    print(filtered_temps)
    filtered_temps = list(filtered_temps)  # convert filter object to list of filtered data
    print(filtered_temps)


Go ahead and run the cell. The first output is something like "<filter object at 0x...>", just like our map object output. The second statement results in an output of "[13.6, 15]". Note that these are the original Celsius values whose Fahrenheit equivalents are above 55, because filter keeps or drops the items themselves and never changes them. When we use filter and pass in temps, it loops over the list one item at a time. It passes each item into the filterTemps function, and if the returned value is True, it keeps the item in the filter object; if it is False, the item is dropped. It isn't until we type convert the object into a list that we're able to output the data. Using lambda functions can reduce the lines of code needed even further.
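One detail worth checking by hand is that filter keeps the original items, not the values computed inside the test function. The sketch below uses the illustrative names `temps_c`, `to_f`, `kept_c`, and `kept_f` to show the difference, and how map and filter can be combined when the converted values are what you want.

```python
temps_c = [12.5, 13.6, 15, 9.2]

def to_f(c):
    return (9 / 5) * c + 32

# filter keeps the original Celsius items whose converted value passes the test
kept_c = list(filter(lambda c: to_f(c) > 55, temps_c))
print(kept_c)  # [13.6, 15]

# To keep the Fahrenheit values instead, convert first and then filter
kept_f = list(filter(lambda f: f > 55, map(to_f, temps_c)))
print(len(kept_f))  # 2
```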

Filter with Lambdas

Let's perform the same steps as earlier, except this time we'll use a lambda function.

    # using the filter function with lambda functions, filter out temps below 55F
    temps = [12.5, 13.6, 15, 9.2]
    filtered_temps = list(filter(lambda C: True if (9/5) * C + 32 > 55 else False, temps))  # type convert the filter
    print(filtered_temps)

Go ahead and run the cell. We'll get the same output as we did earlier, except this time we were able to reduce the number of lines used with our lambda function. The same process that we're performing can be found in the following lines of code.

    >>> def convertDeg(degrees):
    >>>     filtered = []
    >>>     for degree in degrees:
    >>>         result = (9/5) * degree + 32
    >>>         if result > 55:
    >>>             filtered.append(degree)
    >>>     return filtered
    >>> temps = [12.5, 13.6, 15, 9.2]
    >>> filtered_temps = convertDeg(temps)
    >>> print(filtered_temps)

Just like the map function with lambdas, pairing the filter function with a lambda cuts down our code considerably.


The Problem with Reduce

Although I'll show you how to use the reduce function, you should understand that there is a better approach than using the actual function. According to the creator of Python himself:

"So now reduce(). This is actually the one I've always hated most, because, apart from a few examples involving + or *, almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what's actually being fed into that function before I understand what the reduce() is supposed to do. So in my mind, the applicability of reduce() is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly." [1]

In other words, he's saying that reduce only serves a few purposes and is of little use beyond them, so it makes more sense to use a simple for loop. Let's take a look at both examples.

Note: Reduce was a built-in function in Python 2; it has since been moved into the functools library.

Using Reduce

The reduce function accepts two arguments: the function to perform and the data collection to iterate over. Unlike filter and map, however, reduce works on two items at a time instead of one. The result of reduce is always a single value. In the following example, we want to multiply all the numbers together. Let's use reduce to do so.

    # for informational purposes this is how you use the reduce function
    from functools import reduce
    nums = [1, 2, 3, 4]
    result = reduce(lambda a, b: a * b, nums)  # result is 24
    print(result)

[1] www.artima.com/weblogs/viewpost.jsp?thread=98196


Go ahead and run the cell. The output will be 24. Because reduce passes two values at a time into the lambda (the running result and the next item), it condenses the nums list down to a single return value. In the following, you'll see the recommended way to perform the same process.

    >>> total = 1
    >>> for n in nums:
    >>>     total = total * n

Note that total starts at 1 rather than 0, since multiplying by zero would always give zero. For most cases, it's easy to see why Guido van Rossum was so adamant about recommending for loops instead, as reduce can become difficult to follow when you try to condense more complex data collections such as lists within lists.
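To compare the approaches side by side, here is a short sketch that computes the same product three ways. The variable names `via_reduce` and `via_prod` are illustrative, and `math.prod` is a standard-library function available from Python 3.8 onward that the text itself does not mention.

```python
from functools import reduce
import math

nums = [1, 2, 3, 4]

via_reduce = reduce(lambda a, b: a * b, nums)

# Explicit accumulation loop; the accumulator starts at 1, the identity for multiplication
total = 1
for n in nums:
    total *= n

via_prod = math.prod(nums)  # requires Python 3.8 or later

print(via_reduce, total, via_prod)  # 24 24 24
```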

WEDNESDAY EXERCISES

1. Mapping Names: Use a lambda and the map function to map over the following list of names and produce the result ["Ryan", "Paul", "Kevin Connors"].

    >>> names = [" ryan", "PAUL", "kevin Connors "]

2. Filter Names: Using a lambda and the filter function, filter out all the names that start with the letter "A". Make it case insensitive, so that it filters out the name whether it's uppercase or not. The output for the following list should be ["Frank", "Ripal"].

    >>> names = ["Amanda", "Frank", "abby", "Ripal", "Adam"]

Today we learned about a few important built-in functions that we can use when working with data in Python. Pairing map and filter with lambdas helps to improve the readability of our code and shorten the lines needed. Lastly, reduce can be useful in some cases, although an explicit for loop is usually easier to read.


Thursday: Recursive Functions and Memoization

Recursion is a concept in programming where a function calls itself one or more times within its block. These types of functions can often run into speed problems, however, because the function keeps calling itself. Memoization helps this process by storing values that were already calculated so they can be used later. Let's first understand more about recursive functions. To follow along with this lesson, let's continue from our previous notebook file "Week_08" and simply add a markdown cell at the bottom that says "Recursive Functions and Memoization."

Understanding Recursive Functions

All recursive functions have what is known as a "base case," or a stopping point. Like loops, you need a way to break out of the recursive calls. Without one, you create an endless chain of calls that will eventually crash. For example, imagine we set the base case to 1 and asked the following questions.

1. Can you calculate the total of 5? No.
2. Can you calculate the total of 5 * 4? No.
3. Can you calculate the total of 5 * 4 * 3? No.
4. Can you calculate the total of 5 * 4 * 3 * 2? No.
5. Can you calculate the total of 5 * 4 * 3 * 2 * 1? Yes, we've reached our base case.

In this example, we started our recursive calls at 5 and needed to reach our base case before we could calculate the total. On each new call, we add one number to the expression, which is the previous number minus one. This is an example of a factorial function, which makes one recursive call at a time. Depending on the task, functions can make two recursive calls at once. The clearest example of this is the Fibonacci sequence. We'll program both together. You might ask yourself how these are useful and why you would use them. They're used often in searching and sorting algorithms because of the repetitive tasks that occur.
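The five questions above map directly onto a recursive factorial. The following is a minimal sketch under that reading, not necessarily the book's exact listing; it treats anything at or below 1 as the base case so that factorial(0) also returns 1.

```python
def factorial(n):
    # Base case: stop recursing at 1 (0 is included so factorial(0) returns 1 too)
    if n <= 1:
        return 1
    # Recursive case: n multiplied by the factorial of the number before it
    return n * factorial(n - 1)

print(factorial(5))  # 120
```

Each call waits for the call below it, so nothing is multiplied until the base case returns; the products are then built up as 1, 2, 6, 24, 120 on the way back.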


Imagine you needed to search through a four-dimensional array, otherwise known as a list within a list within a list within a list. Instead of writing a series of for loops to iterate over each list, you could write a recursive function that calls itself every time a new dimension is found. The code would take fewer lines and be easier to read. Let's check out some examples.
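As a rough illustration of that idea, the sketch below searches a list nested to any depth. The function name `search_nested` and the sample data are hypothetical; the point is only that one recursive call replaces a separate for loop per dimension.

```python
def search_nested(items, target):
    """Return True if target appears anywhere in a list nested to any depth."""
    for item in items:
        if isinstance(item, list):
            # Found a new dimension: recurse into it
            if search_nested(item, target):
                return True
        elif item == target:
            return True
    return False

print(search_nested([1, [2, [3, [4, 5]]]], 5))  # True
print(search_nested([[[[1]], [2]], 3], 4))      # False
```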

Writing a Factorial Function

Factorials are one of the simpler examples of recursion, as they're the result of a given number multiplied by all the previous numbers down to one. Let's try programming it.

    # writing a factorial using recursive functions
    def factorial(n):
        # set your base case
        if n …

    … searchList([2, 3, [18, 22], 6], 22)

Today, we learned all about recursive functions and how to improve them with the concept of memoization. We were able to use a simple caching technique to store previously calculated values. Recursive functions can be useful when used sensibly, but in most cases a simple for loop will suffice, as recursive functions can become slow as the number of calls grows.
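To make the caching idea concrete, here is a hedged sketch of memoization applied to the Fibonacci sequence. The names `cache`, `fib`, and `fib_cached` are assumptions for illustration; `functools.lru_cache` is a standard-library decorator that provides the same kind of caching automatically.

```python
from functools import lru_cache

cache = {}  # maps n -> fib(n) for values already computed

def fib(n):
    if n in cache:
        return cache[n]  # reuse a stored result instead of recomputing it
    if n <= 1:
        return n         # base cases: fib(0) = 0, fib(1) = 1
    cache[n] = fib(n - 1) + fib(n - 2)
    return cache[n]

@lru_cache(maxsize=None)
def fib_cached(n):
    # Same recursion; the decorator handles the caching
    return n if n <= 1 else fib_cached(n - 1) + fib_cached(n - 2)

print(fib(50))         # 12586269025
print(fib_cached(50))  # 12586269025
```

Without a cache, the two recursive calls recompute the same values over and over; with it, each value of n is calculated only once.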

Friday: Writing a Binary Search

This week's project is about understanding one of the more efficient algorithms in programming: Binary Search. When you need to search through a list full of data, you need to do so efficiently. It may not make sense to create an algorithm for a list of ten items, but imagine if it were a million items. You don't want to check every item in the list to try and find what you're looking for. Instead, we use algorithms like Binary Search to perform these tasks. To follow along with this lesson, let's continue from our previous notebook file "Week_08" and add a markdown cell at the bottom that says "Friday Project: Writing a Binary Search."

Final Design

Although the program itself will be relatively small, we must understand how the Binary Search algorithm works. For our design concept this week, we'll lay out the steps that we need to follow. Remember that algorithms are nothing more than a set of steps. Binary Search is no different. The steps for this algorithm are as follows:


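1. Sort the list.
2. Find the middle index.
3. Check the value at the middle index; if it is the value we're looking for, return True.
4. Check the value at the middle index; if it is greater than the value we're looking for, cut off the right half of the list.
5. Check the value at the middle index; if it is less than the value we're looking for, cut off the left half of the list.
6. Repeat steps 2 through 5 until the list is empty.
7. If the while loop ends, it means there are no items left, so return False.

Let's walk through an example together with the following arguments: [14, 0, 6, 32, 8], and we'll be looking for the number 14. See Table 8-1 for the step-by-step walkthrough.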

Table 8-1. Walkthrough of the Binary Search example

Step | Value of variable       | Description                                | Code
-----|-------------------------|--------------------------------------------|------------------------
1    | list: [0, 6, 8, 14, 32] | Sort the list immediately                  | list.sort()
2    | mid: 2                  | Find the middle, 5 / 2, round down         | len(list) // 2
3    | value: 8                | Doesn't return True, 8 is not 14           | list[2]
4    | condition: False        | 8 is less than 14, don't run the block     | if list[2] > 14
5    | list: [14, 32]          | Run block, cut off first half of list      | list = list[mid + 1:]
2    | mid: 1                  | Mid index is 1 because of 2 / 2            | len(list) // 2
3    | value: 32               | Doesn't return True, 32 is not 14          | list[1]
4    | list: [14]              | Run block, cut off second half of list     | list = list[:mid]
2    | mid: 0                  | Find the middle, 1 / 2, round down         | len(list) // 2
3    | return True             | Value at mid index is 14, return True      | return True


A linear search would require us to search the list item by item to see if the number we’re looking for was in the list. When thinking about efficiency and how long a search may take to complete the task, it would be based on the length of the list. As the length of the list grows, so does the time it takes to find the number we’re looking for. With a Binary Search, however, the time it takes to find a number within a list only takes a minimal number of steps even when the list is a million numbers. For example, when you search a list of one million numbers, a linear search could take one million tries to find the number, but a Binary Search would be able to find it within 20 guesses. As it searches, it cuts the list in half. Within 10 guesses you’re already working with a list of under 2,000 items. This is the beauty of an efficient algorithm. Let’s walk through each step together to understand how the algorithm is programmed
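The "within 20 guesses" figure can be checked with a few lines of arithmetic. This sketch assumes the worst case, where each guess leaves at most half of the remaining items; the function name `max_binary_steps` is hypothetical.

```python
def max_binary_steps(n):
    """Worst-case number of halvings needed to shrink n items down to nothing."""
    steps = 0
    while n > 0:
        n //= 2     # each guess discards at least half of the remaining items
        steps += 1
    return steps

print(max_binary_steps(1_000_000))  # 20
print(1_000_000 // 2**10)           # 976 items left after ten halvings
```

A linear search over the same list could need up to 1,000,000 comparisons, which is where the difference between the two approaches comes from.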

Program Setup

Before we begin to write our algorithm, we need to set up a way to generate a random list of numbers. Let's import the random module and use list comprehension to generate some data.

    1. # setting up imports and generating a list of random numbers to work with
    2. import random
    4. nums = [random.randint(0, 20) for i in range(10)]  # create a list of ten numbers between 0 and 20
    6. print(sorted(nums))  # for debugging purposes

Go ahead and run the cell. We import the random module in order to generate a list of ten random numbers between 0 and 20 with our list comprehension. For debugging purposes, we output a sorted version of nums on line 6 in order to see the data that we'll be working with.

Step 1: Sort the List

The first step in the algorithm is to sort the list. Generally, you sort the list before passing it in, but we want to take every precaution so that this algorithm works even with unsorted lists. Let's begin by defining the function, as well as sorting the list passed in.


     4. nums = [random.randint(0, 20) for i in range(10)]  # create a ...
    ◽◽◽
     6. def binarySearch(aList, num):
     7.     # step 1: sort the list
     8.     aList.sort()
    10. print(sorted(nums))  # for debugging purposes
    12. print(binarySearch(nums, 3))

We've added the function call at the bottom and will be printing the returned value, but for now the function doesn't return anything, so the call prints None when you run the cell. Let's move on to step 2.

Step 2: Find the Middle Index

In this step, we need to find the middle index. I'm not talking about the value of the item in the middle of the list but rather the actual index number. If we're searching a list of one million items, the middle index would be 500,000. The value at that index could be any number, but again, that's not what this step is for. Let's write out the second step.

     8.     aList.sort()
    ◽◽◽
    10.     # step 2: find the middle index
    11.     mid = len(aList) // 2  # two slashes means floor division – round down to the nearest whole num
    13.     print(mid)  # remove once working
    15. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. In order to find the middle index, we need to divide the length of the list by two and then round down to the nearest whole number. We need to use whole numbers because an index is only ever a whole number. You could never access index 1.5. Also, we round down because rounding up would cause index out of range errors. For example, if there is one item within the list, then 1 / 2 = 0.5 and rounding up to one would cause an error, as the single item within the list is at index zero. The output will result in 5, as we're working with a list of ten numbers. Go ahead and remove the print statement at line 13 when you're done.


Step 3: Check the Value at the Middle Index

Now that we have the middle index, we want to see if the value at that given index is the number that we're looking for. If it is, then we want to return True.

    11.     mid = len(aList) // 2  # two slashes ...
    ◽◽◽
    13.     # step 3: check the value at middle index, if it is equal to num return True
    14.     if aList[mid] == num:
    15.         return True
    17. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. You'll get an output of either True or None, depending on the list that was randomly generated for you. If the number 3 appears at index 5, then your output will be True as our condition on line 14 is True and will run the return statement.

Step 4: Check if Value Is Greater

If the number that we're looking for isn't at the middle index, then we need to figure out which half of the list to remove. Let's first check if the value at the middle index is greater than the number we're searching for. If it is, we need to cut off the right half of the list.

    15.         return True
    ◽◽◽
    17.     # step 4: check if value is greater, if so, cut off right half of list using slicing
    18.     elif aList[mid] > num:
    19.         aList = aList[:mid]
    21.     print(aList)  # remove after working properly
    23. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. On line 18 we check to see if the value at the middle index of the list is greater than the argument that we passed in during the function call. Line 19 is where the magic of Binary Search occurs though. Using slicing, we're able to re-declare the value of aList as the beginning half of the list.


Note: Remember that slicing allows you to input the start, stop, and step. If you don't input a number, as in the slice earlier, it implies that you are using the default values. The default values are start = 0, stop = len(list), and step = 1.

We imply that we want to keep all the items from index zero up to, but not including, the middle index. Remove line 21 after you're done, as it will simply output the result of our new aList.
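The slicing defaults can be seen on a small list. This sketch reuses the sorted list from the Table 8-1 walkthrough; the names `a_list`, `left`, and `right` are illustrative.

```python
a_list = [0, 6, 8, 14, 32]
mid = len(a_list) // 2    # 2

left = a_list[:mid]       # same as a_list[0:mid:1]
right = a_list[mid + 1:]  # same as a_list[mid + 1:len(a_list):1]

print(left)    # [0, 6]
print(right)   # [14, 32]
print(a_list)  # [0, 6, 8, 14, 32]
```

Slicing builds a new list and leaves the original untouched, and neither slice includes the item at the middle index itself.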

Step 5: Check if Value Is Less

This step is exactly the same as step 4 but with the opposite condition. If the value at the middle index is less than the number we're looking for, we want to remove the left half.

    19.         aList = aList[:mid]
    ◽◽◽
    20.     # step 5: check if value is less, if so, cut off left half of list using slicing
    21.     elif aList[mid] < num:
    22.         aList = aList[mid + 1:]
    23.     print(aList)  # remove after working properly
    25. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. On line 22 we perform the opposite slice from step 4. This time we declare "mid + 1" because we don't want to include the middle index, as it's already been checked. The logic has now been implemented for our Binary Search. All that's left is to set up a loop to repeat steps 2 through 5 and return False if we don't find what we're looking for.

Step 6. Set Up a Loop to Repeat Steps We’ll need to loop until the argument is found, or until the list is empty. This sounds like a great case for a while loop. After creating the while statement, we need to make sure we execute the code for steps 2 through 5 within the loop


     8.     aList.sort()
    ◽◽◽
    10.     # step 6: set up a loop to repeat steps 2 through 5 until list is empty
    11.     while aList:
    12.         mid = len(aList) // 2
    14.         if aList[mid] == num:
    15.             return True
    16.         elif aList[mid] > num:
    17.             aList = aList[:mid]
    18.         elif aList[mid] < num:
    19.             aList = aList[mid + 1:]
    21.         print(aList)  # remove after working properly
    23. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. Our Binary Search is now performing all the necessary steps to either return True when the argument is found or end up with an empty list, in which case the loop will end. Remember that our preceding while statement is the same as "while len(aList) > 0:". All that's left is to return False if the loop ends, as that means that the list does not contain our number.

Step 7: Return False Otherwise

To complete our Binary Search, we simply need to return False after the while loop ends.

    19.             aList = aList[mid + 1:]
    ◽◽◽
    21.     # step 7: return False, if it makes it to this line it means the list was empty and num wasn't found
    22.     return False
    24. print(sorted(nums))  # for debugging purposes
    ◽◽◽

Go ahead and run the cell. We've now completed the Binary Search algorithm. Now when you run the cell, you'll get an output of either True or False. Feel free to print out the list within the while loop, so you can see how the list is being truncated on each step.


Final Output

You can find all the code for this week, as well as this project, in the GitHub repository. The final output in the following won't include any of the comments we added in previous blocks so that you may see the complete version unobstructed.

    # full output of binary search without comments
    import random

    nums = [random.randint(0, 20) for i in range(10)]

    def binarySearch(aList, num):
        aList.sort()

        while aList:
            mid = len(aList) // 2

            if aList[mid] == num:
                return True
            elif aList[mid] > num:
                aList = aList[:mid]
            elif aList[mid] < num:
                aList = aList[mid + 1:]

        return False

    print(sorted(nums))
    print(binarySearch(nums, 3))

Go ahead and run the cell. If you ran into any problems, be sure to reference this code. Try increasing the number of items within the list you pass in and see how quickly it can find your number. Even on large lists, this algorithm will execute with extreme speed.
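As a possible variation, and not the book's own implementation, the search can track two index bounds instead of slicing, which avoids copying part of the list on every step and leaves the caller's list untouched. The sketch below assumes the list is already sorted; `binary_search_indexed`, `low`, and `high` are hypothetical names. It cross-checks the result against Python's built-in `in` operator.

```python
import random

def binary_search_indexed(sorted_list, num):
    """Binary search on an already sorted list using index bounds instead of slices."""
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_list[mid] == num:
            return True
        elif sorted_list[mid] > num:
            high = mid - 1  # discard the right half
        else:
            low = mid + 1   # discard the left half
    return False

nums = sorted(random.randint(0, 20) for _ in range(10))

# Cross-check every candidate value against the built-in membership test
print(all(binary_search_indexed(nums, n) == (n in nums) for n in range(-1, 22)))  # True
```

The same cross-check is a handy way to test the slicing version from the text as well.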

Today was important in understanding not only how Binary Search works, but how we can program an algorithm from a set of step-by-step instructions. Algorithms can be simple to understand, yet difficult to translate into code. Using this algorithm, we can begin to understand how searches can be efficient, even when there are large amounts of data to sift through

Weekly Summary

Throughout this week, we were able to go over some of the more advanced topics within Python. As you begin to build your programming experience, you should always be thinking about efficiency. First and foremost, we need to make sure that our programs are correct in their execution, but then we need to be aware of their speed. If an algorithm or program could give you the price of a stock to the cent, but it took ten years to execute, it would be worthless. That’s the importance of a great algorithm. Along with efficiency, we want to keep in mind the readability of our code. Although using list comprehensions, lambdas, and recursive functions doesn’t necessarily improve the speed of our program, it helps to improve our ability to read what’s happening. During the lessons next week, we’ll be covering algorithmic complexity and the importance of performance when using certain data types.

Challenge Question Solution

In the following, you can find the solution to the challenge question this week.

     1. # ask user for input, return whether it is prime or not
     3. def isPrime(num):
     4.     if num < 2:    # 0, 1, and negative numbers are not prime
     5.         return False
     6.     for i in range(2, int(num**0.5) + 1):
     7.         if num % i == 0:
     8.             return False
     9.     return True    # no divisor was found
    11. n = int(input("Type a number: "))
    13. if isPrime(n):
    14.     print("That is a prime number.")
    15. else:
    16.     print("That is not a prime number.")

The most important part of this program is on line 6. Although you may have gotten it correct, we wanted to create this program so that it was efficient. The for statement on line 6 could have also looked like the following:

    >>> for i in range(2, num):

The problem with this line, however, is that it’s not efficient. When you are checking whether a number is prime, the square root of the number is as high as you need to go, because any factor larger than the square root must be paired with one smaller than it. If a number isn’t divisible by any integer from two up to its square root, then it’s a prime number. If we didn’t take the square root of the number passed in, we would’ve had to loop all the way up to the number itself. Let’s take the number 97, for instance, which is a prime number. Using the second for loop statement, we would’ve looped for a total of 95 iterations. With the statement written in the code block, however, we only loop for a total of eight iterations. As the number you’re passing in gets larger, so too does the gap between the two iteration counts. Therefore, it’s always important to keep efficiency in mind when programming.
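As a quick check of those counts, here’s a minimal sketch that measures how many passes each version of the loop would make for a prime number, which is the worst case because no divisor ends the loop early. The two helper names are made up for this illustration.

```python
# Illustrative helpers: how many values each range produces for a given number.
def iterations_full(num):
    return len(range(2, num))                      # loop up to the number itself

def iterations_sqrt(num):
    return len(range(2, int(num ** 0.5) + 1))      # loop up to the square root

for n in (97, 9973, 1000003):
    print(n, iterations_full(n), iterations_sqrt(n))
```

For 97 this gives 95 passes against 8, and the gap widens quickly as the number grows.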

Weekly Challenges

To test out your skills, try these challenges:

1. Recursive Binary Search: Turn the Binary Search algorithm that we created together into a recursive function. Rather than using a while loop, it should call itself in order to cut the list down and eventually return True or False.

2. Efficient Algorithms: Looking at the Binary Search we wrote, how could you possibly make it even more efficient?

3. Case-Sensitive Search: Rewrite the Binary Search so that it works with a list that holds both numbers and letters. It should be case sensitive. Use the following function call to understand the parameters being passed in. Hint: “22” < ‘a’ will return True.

    >>> binarySearch(['a', 22, '3', 'hello', 1022, 4, 'e'], 'hello')    # returns True

CHAPTER 9

Advanced Topics II: Complexity

This week is the continuation of advanced Python concepts and will cover more topics that a developer has to understand on the job. To begin the week, we’ll cover a concept that you’ve been using this whole time: generators and iterators. Over the following couple of days, we’ll cover decorators and modules, which will help us in building larger-scale applications. These concepts will help you understand how frameworks like Flask and Django are used. Although I don’t like talking about theory within this book, it’s important to understand how time complexity works with algorithms. On Thursday, we’ll dive into Big O Notation and understanding algorithms further. All the lessons within the book have led you to the point of being able to further your education into becoming a Python developer. This all leads us into our Friday project, which is interview prep. As this book is set up as a tool for improving or changing your career, an important piece of that is the interview process. There will be information about the process, what to expect, and how to handle some interview questions that you may be asked.

Overview

• Understanding generator and iterator objects
• Using and applying decorators
• Creating and importing modules
• What is time complexity and Big O Notation?
• Knowing how to handle interviews, questions, and more

CHALLENGE QUESTION

As a programmer you must think about the time it takes to execute a program. Even a program that will give you 100% accurate answers can be useless if it doesn’t give the answer to you in time. Without looking it up, do you think lists or dictionaries are more efficient when needing to retrieve and store information?

Monday: Generators and Iterators

In previous sections of this book, you may have seen the words generators or iterators mentioned. Without knowing it, you’ve been using them the entire time. Today, we’ll dive into what each of these concepts is and how to use it. To follow along with the content for today, let’s open up Jupyter Notebook from our “python_bootcamp” folder. Once it’s open, create a new file, and rename it to “Week_09.” Next, make the first cell a markdown cell with a header saying “Generators and Iterators.” We’ll begin working underneath that cell.

Iterators vs. Iterables

An iterator is an object that hands out its items one at a time, meaning you can traverse through all of its values. An iterable is a collection like a list, dictionary, tuple, or set. The major difference is that iterables are not iterators; rather, they are containers for data that can produce an iterator on request. In Python, iterator objects implement the magic methods __iter__ and __next__, which allow you to traverse through their values.
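The difference can be checked directly in a cell. The following is a minimal sketch using only built-in functions; the variable names are just for illustration.

```python
# A list is an iterable: it can hand out iterators but is not one itself.
numbers = [1, 2, 3]

print(hasattr(numbers, "__iter__"))   # lists can produce an iterator
print(hasattr(numbers, "__next__"))   # but they cannot be advanced directly

it = iter(numbers)                    # ask the iterable for an iterator
print(hasattr(it, "__next__"))        # the iterator can be advanced
print(iter(it) is it)                 # calling iter() on an iterator returns the same object
print(next(it), next(it), next(it))
```

The list reports that it has __iter__ but not __next__, while the object returned by iter() has both.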

Creating a Basic Iterator

We can create iterators easily from iterables. You can simply use the iter() function to do so.

     1. # creating a basic iterator from an iterable
     3. sports = ["baseball", "soccer", "football", "hockey", "basketball"]
     5. my_iter = iter(sports)
     7. print(next(my_iter))    # outputs first item
     8. print(next(my_iter))    # outputs second item
    10. for item in my_iter:
    11.     print(item)
    13. print(next(my_iter))    # will produce error

Go ahead and run the cell. Iterators always remember the last item that they returned, which is why we get an error (a StopIteration exception) on line 13: the loop on line 10 has already used up the remaining items. Using the next() function, we’re able to output the next item within the iterator. Once all the items within the iterator have been used, however, we can no longer traverse through the iterator, as there are no more items left. Iterators are great for looping as well, and like lists and dictionaries, we can simply use the in keyword (see line 10). You can still loop over the original list like we normally do, and it will always begin from index 0, but once our iterator is out of items, we can no longer use it.

Creating Our Own Iterator

Now that we’ve seen how to create an iterator from a Python iterable, let’s create our own iterator class that will output each letter in the alphabet. To create an iterator, we’ll need to implement the magic methods __iter__() and __next__().

    # creating our own iterator

    class Alphabet():
        def __iter__(self):
            self.letters = "abcdefghijklmnopqrstuvwxyz"
            self.index = 0
            return self

        def __next__(self):
            if self.index ...

    >>> for i in RevIter([1, 2, 3, 4, 5]):

2. Squares: Create a generator that acts like the range function, except it yields a squared number every time. The result of the following call should be “0, 1, 4, 9”.

    >>> for i in range(4):

Today we learned how to build our own range function, as well as how data collections can be iterated over. Generators are a simplified way of writing iterators and use the yield keyword to return information. Iterator classes must implement the __iter__ and __next__ methods and are useful for creating our own sequences to iterate over.
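To recap both approaches side by side, here’s a minimal sketch. It assumes the Alphabet iterator ends by raising StopIteration once every letter has been returned, which is the standard way to tell a for loop to stop; the generator name alphabet_gen is made up for this illustration.

```python
# Sketch of a complete iterator class (an assumed completion, for illustration).
class Alphabet:
    def __iter__(self):
        self.letters = "abcdefghijklmnopqrstuvwxyz"
        self.index = 0
        return self

    def __next__(self):
        if self.index >= len(self.letters):
            raise StopIteration          # signals the for loop to stop
        letter = self.letters[self.index]
        self.index += 1
        return letter

# The same sequence written as a generator function using yield.
def alphabet_gen():
    for letter in "abcdefghijklmnopqrstuvwxyz":
        yield letter

print("".join(Alphabet()))
print("".join(alphabet_gen()))
```

Both lines print the full alphabet; the generator gets there with far less code because Python creates the __iter__ and __next__ behavior for us.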

Tuesday: Decorators

If you want to learn about frameworks, or understand how to improve functions within Python, then you need to understand what a decorator is and how it works. It will help to simplify our code as well as reduce the lines necessary to improve our programs. To follow along with this lesson, let’s continue from our previous notebook file “Week_09” and simply add a markdown cell at the bottom that says “Decorators.”

What Are Decorators?

Decorators, also known as wrappers, are functions that give other functions extra capabilities without explicitly modifying them. They are denoted by the “@” symbol in front of the decorator’s name, which is written above a function declaration like the following:

    >>> @decorator
    >>> def normalFunc():

Decorators are useful when you want to perform some functionality before or after a function executes. For example, let’s imagine you wanted to restrict access to a function based on a user being logged in. Rather than writing the same conditional statement for every function you create, you could put the code into a decorator and apply the decorator onto all functions. Now, whenever a function is called, the conditional statement will still run, but you were able to save yourself several lines. This is a real-life pattern in the Flask framework, where decorators restrict access to certain pages based on user authentication. We’ll see a minimal example of this later today.

Higher-Order Functions A higher-order function is a function that operates on other functions, either by taking a function as its argument or by returning a function. We saw this done in last week’s lesson with lambdas, map, filter, and reduce. Decorators are higher-order functions because they take in a function and return a function

Creating and Applying a Decorator

We’ll need to declare a function that takes in another function as an argument in order to create a decorator. Inside of this decorator, we can then define another function to be returned that will run the function that was passed in as an argument. Let’s see how this is written.

     1. # creating and applying our own decorator using the @ symbol
     3. def decorator(func):
     4.     def wrap():
     5.         print("======")
     6.         func()
     7.         print("======")
     8.     return wrap
    10. @decorator
    11. def printName():
    12.     print("John.")
    14. printName()

Go ahead and run the cell. We’ll get an output of “John.” with equal signs above and below the name that act as a border. On line 10 we attached our decorator to the printName function. When Python defines printName, it passes the function into decorator as the argument “func” and replaces printName with the wrap function that decorator returns. From then on, whenever printName is called, it is wrap that runs. This wrap function will print a border, then call the func argument, and then print another border. Remember that a decorator must return a function so that the decorated name can still be called. Our decorator that we declared can be attached to any function that we write. All functions with this decorator will simply run with a border above and below them.
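It can help to see what the “@” line does behind the scenes. The sketch below applies the same decorator by hand, without the “@” symbol. The use of functools.wraps is an addition that isn’t part of the lesson’s code; it copies the original function’s name onto the wrapper, which makes debugging easier.

```python
import functools

def decorator(func):
    @functools.wraps(func)       # addition: keep the wrapped function's name and docstring
    def wrap():
        print("======")
        func()
        print("======")
    return wrap

def printName():
    print("John.")

# Applying the decorator by hand; writing "@decorator" above the
# function definition performs this same reassignment for us.
printName = decorator(printName)

printName()
print(printName.__name__)
```

The name printName now points at wrap, yet printName.__name__ still reports “printName” thanks to functools.wraps.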

Decorators with Parameters

Although decorators simply add extra capabilities to functions, they can also have arguments like any other function. Let’s take the following example where we want to run a function x times.

     1. # creating a decorator that takes in parameters
     3. def run_times(num):
     4.     def wrap(func):
     5.         for i in range(num):
     6.             func()
     7.     return wrap
     9. @run_times(4)
    10. def sayHello():
    11.     print("Hello.")

Go ahead and run the cell. This cell will output “Hello.” four times. The syntax changes when the decorator accepts an argument. Our decorator this time accepted an argument of num, and the wrap function accepted the function as the argument this time. Within our wrap function, we created a for loop that would run the function attached to our decorator as many times as the argument declared on the decorator on line 9.

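Note: In this example, the wrap function calls the function itself as soon as the decorator is applied, so we do not need to call sayHello. Because wrap returns nothing, the name sayHello no longer refers to a callable function afterward.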
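A more common way to write a decorator that takes arguments uses three nested functions, so that the decorated function stays callable and only runs when we call it. The following is a sketch of that pattern rather than the lesson’s own code; the inner names decorator and wrap, and the use of functools.wraps, are choices made for this illustration.

```python
import functools

def run_times(num):                  # outer level receives the decorator's argument
    def decorator(func):             # middle level receives the function
        @functools.wraps(func)
        def wrap(*args, **kwargs):   # inner level runs on every call
            result = None
            for _ in range(num):
                result = func(*args, **kwargs)
            return result            # hand back the last return value
        return wrap
    return decorator

@run_times(4)
def sayHello():
    print("Hello.")

sayHello()    # nothing prints until this call
```

With this version, sayHello can be called as often as we like, and each call prints “Hello.” four times.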

Functions with Decorators and Parameters

When you need a function to accept arguments, while also having a decorator attached to it, the wrap function must take in the same exact arguments as the original function. Let’s try it.

     1. # creating a decorator for a function that accepts parameters
     3. def birthday(func):
     4.     def wrap(name, age):
     5.         func(name, age + 1)
     6.     return wrap
     8. @birthday
     9. def celebrate(name, age):
    10.     print("Happy birthday {}, you are now {}.".format(name, age))
    12. celebrate("Paul", 43)

Go ahead and run the cell. This will output a nicely formatted string with the information passed in on line 12. The decorator receives celebrate as the argument func when celebrate is defined. When we call celebrate, the two arguments “Paul” and 43 get passed into wrap. When we call our function within wrap, we pass the same arguments into the function call; however, we increment the age parameter by one.

Restricting Function Access

You’re probably wondering how decorators can serve a purpose, since the last few cells seem meaningless. For each one of them, we could have simply added those lines within the original function. That was just for syntax understanding though. Decorators are used a lot with frameworks and help to add functionality to many functions that you’ll write within them. One example is being able to restrict access of a page or function based on user login credentials. Let’s create a decorator that will help to restrict access if the password doesn’t match.

     1. # real world sim, restricting function access
     3. def login_required(func):
     4.     def wrap(user):
     5.         password = input("What is the password?")
     6.         if password == user["password"]:
     7.             func(user)
     8.         else:
     9.             print("Access Denied")
    10.     return wrap
    12. @login_required
    13. def restrictedFunc(user):
    14.     print("Access granted, welcome {}".format(user["name"]))
    16. user = {"name": "Jess", "password": "ilywpf"}
    18. restrictedFunc(user)

Go ahead and run the cell. On line 13 we declared a normal function that would take in a user and output a statement with their name and accessibility. Our decorator was attached on line 12 so that when we call restrictedFunc and pass in our created user, it would run through the decorator. Within the wrap function, we ask the user for a password and check whether the password is correct or not on line 6. If they type in the correct password, then we allow them to access the function and print out “Access granted, welcome Jess”. However, if the password is incorrect, then we output “Access Denied” and never run restrictedFunc. This is a simplified version of how Flask handles user restrictions for pages, but it shows the importance of decorators. We can now attach login_required to any of the functions that we feel should be accessed only by users.

TUESDAY EXERCISES

1. User Input: Create a decorator that will ask the user for a number, and run the function it is attached to only if the number is less than 100. The function should simply output “Less than 100”. Use the function declaration in the following.

    >>> @decorator
    >>> def numbers():
    >>>     print("Less than 100")

2. Creating a Route: Create a decorator that takes in a string as an argument with a wrap function that takes in func. Have the wrap function print out the string, and run the function passed in. The function passed in doesn’t need to do anything. In Flask, you can create a page by using decorators that accept a URL string. Use the function declaration in the following to start.

    >>> @route("/index")
    >>> def index():
    >>>     print("This is how web pages are made in Flask")

Today was an important lesson in preparation for other technologies that use Python, such as frameworks. Decorators let us add behavior before or after a function runs and can be attached to any function that needs it. This helps to reduce repeated code while extending what our functions can do.

Wednesday: Modules

Most programs tend to include so many lines of code that you wouldn’t store it all within a single file. Instead you separate the code into several files, which helps to keep the project organized. Each one of these files is known as a module. Within these modules are variables, functions, classes, etc., that you can import into a project. Luckily, Python has a large following of developers that create modules for us to use in order to enhance our own projects. Today, we’ll look at some modules that are included with Python, how to import them, how to use them, and how to write our own modules to be used within Jupyter Notebook. To follow along with this lesson, let’s continue from our notebook file “Week_09” and simply add a markdown cell at the bottom that says “Modules.”

Importing a Module

For the next few examples, we’ll be working with the math module, which is one of Python’s built-in modules. This specific module has functions and variables to help us with any problem related to math, whether it’s rounding, working with pi, or many other math-related tasks. For this first cell, we’re going to import the entire math module and its contents.

    # import the entire math module
    import math

    print(math.floor(2.5))    # rounds down
    print(math.ceil(2.5))     # rounds up
    print(math.pi)

Go ahead and run the cell. We’ll get an output of “2”, “3”, and “3.141592653589793”. When we imported math, we were able to access all of math’s functions, variables, and classes. In this example, we call two functions and read one variable that are stored within the math module. In order to import the entire module and its contents, you simply put the keyword import before the name of the module. Whenever you’d like to access any of its contents, you need to use dot syntax. Now we can use any of math’s code.

Importing Only Variables and Functions

When you know that you won’t need to use the entire module, but rather a couple of functions or variables, you can import them directly. It’s good practice to import only what you need. In the previous cell, we imported the entire math module; however, we didn’t really need to, as we only used two functions and a variable from it. To import something specifically, you’ll need to include the from keyword and the name of what you’d like to import.

    # importing only variables and functions rather than an entire module
    from math import floor, pi

    print(floor(2.5))
    # print(ceil(2.5)) will cause error because we only imported floor and pi, not ceil and not all of math
    print(pi)

Go ahead and run the cell. We’ll get an output of “2” and “3.141592653589793”. The import statement changes slightly when importing specific parts of the module. To separate multiple imports from a single module, you use a comma. We comment out the print statement for ceil because it won’t work. We only imported floor and pi directly, but not the ceil function. Notice that we don’t need to reference the math module with dot syntax before the names either. This is because we imported the floor function and pi variable directly, so we can now reference them without using dot syntax. Python still loads the math module behind the scenes, but only the names you list are added to your notebook, which keeps your code uncluttered.

Note: You can import classes from modules the same way as earlier; simply use the name of the class.

Using an Alias

Often, the name of what you’d like to import can be lengthy. Rather than having to write out an entire name each time you’d like to use it, you can give an “alias” or nickname when importing.

    # using the 'as' keyword to create an alias for imports
    from math import floor as f

    print(f(2.5))

Go ahead and run the cell. We’ll get an output of “2”, the same result floor gave us in the previous two cells, except this time we were able to reference the floor function as just the letter “f”. This is because of how we wrote our import statement using the “as” keyword. You can rename anything that is imported, although it’s generally best to only do so on larger names.

Creating Our Own Module

Now that we know how to import and call a module, let’s create our own. Go ahead and open any text editor you have on your computer like Notepad or TextEdit. Write the following code in the file, and save it within the same folder that your “Week_09” file is located, with the name “test.py”. If the two files aren’t in the same directory, loading the module later will produce an error.

    # creating our own module in a text editor

    # variables to import later
    length = 5
    width = 10

    # functions to import later
    def printInfo(name, age):
        print("{} is {} years old.".format(name, age))

See Figure 9-1 for an example of what the code will look like within a text editor.

Figure 9-1. test.py module with code in text editor (Notepad++)

You’ve just written your first module. Remember that modules are nothing more than code written in other files that we can import in any of our projects. Now let’s see how to use them.

Using Our Module in Jupyter Notebook

In any other circumstance, you’d import the variables and function we wrote in test.py with the import and from keywords. Jupyter Notebook, however, works a little differently when using modules that you’ve created. We’ll use the “run” command in order to load in the entire module that we’ve created. After we run the file, we can use the variables and functions that we wrote within the module. Let’s check out how to do so.

    # using the run command with Jupyter Notebook to access our own modules
    %run test.py

    print(length, width)

    printInfo("John Smith", 37)    # able to call from the module because we ran the file in Jupyter above

Go ahead and run the cell. You’ll notice that we’re able to output the variables and function print statement that we declared within our test.py module. Keep in mind that the run command runs the file as if it were a single cell. Any function calls or print statements within our module would run immediately. To test this out, try putting a print statement at the bottom of the module. When you work in a development environment (IDE), you’ll write the import as you would normally, like the following:

    >>> from test import length, width, printInfo

This is just how Jupyter Notebook works with files that we create

Note You can place any modules you create within the Python folder on your hard drive. Once the files are there, they can be accessed normally rather than using the run command
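To see how Python locates a module, here’s a self-contained sketch that writes a tiny module file into a temporary folder, adds that folder to the list of places Python searches, and then imports from it. The module name shapes_demo and its function calc_perimeter are made up for this illustration, and using a temporary folder is just a convenience so the example cleans up after itself.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module to a temporary folder (the name "shapes_demo" is made up).
module_source = (
    "length = 5\n"
    "width = 10\n"
    "\n"
    "def calc_perimeter(l, w):\n"
    "    return 2 * (l + w)\n"
)

folder = tempfile.mkdtemp()
with open(os.path.join(folder, "shapes_demo.py"), "w") as f:
    f.write(module_source)

# Python only finds modules in folders listed in sys.path.
sys.path.insert(0, folder)
importlib.invalidate_caches()

import shapes_demo                          # import the whole module
from shapes_demo import length, width       # or import specific names

print(shapes_demo.calc_perimeter(length, width))
```

The import works because the folder holding the file is in sys.path, which is the same reason a module saved next to your own code can be found.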

WEDNESDAY EXERCISES

1. Time Module: Import the time module and call the sleep function. Make the cell sleep for 5 seconds, and then print “Time module imported”. Although we haven’t covered this module, this exercise will provide good practice for you to try and work with a module on your own. Feel free to use Google, Quora, etc.

2. Calculating Area: Create a module named “calculation.py” that has a single function within it. That function should take in two parameters and return the product of them. We can imagine that we’re trying to calculate the area of a rectangle and it needs to take in the length and width properties. Run the module within Jupyter Notebook, and use the following function call within the cell.

    >>> calcArea(15, 30)

Today’s focus was all about modules, how to import them, how to use them, how to create our own, and how to call our own modules within Jupyter Notebook. Understanding how modules work will give you the ability to work with frameworks in Python. Flask, for example, uses a lot of different modules, as each module serves a specific purpose. When you need to keep your project organized, modules are the answer

Thursday: Understanding Algorithmic Complexity

Throughout this book, we’ve been learning by doing. At the beginning, I spoke about how we wouldn’t go much into theory, but rather we would learn by building projects together and coding along. Today’s focus is primarily on the theory of programming and algorithms. If there is a theory in programming that you should understand, it should be Big O Notation. To follow along with this lesson, let’s continue from our previous notebook file “Week_09” and simply add a markdown cell at the bottom that says “Understanding Algorithmic Complexity.”

What Is Big O Notation?

As a software engineer, you’ll often need to estimate the amount of time a program may take to execute. In order to give a proper estimate, you must know the time complexity of the program. This is where algorithmic complexity comes into play, commonly expressed in Big O Notation. It describes how the time an algorithm or program takes to execute grows with the size of its input. Take a list, for example. As the number of items within the list grows, so does the amount of time it takes to iterate over the list. This is known as O(n), where n represents the number of items. It’s called Big O Notation because you put a “Big O” in front of the number of operations. Big O is usually used to describe a worst-case scenario runtime. Even if you search through a list of 100 items and find what you’re looking for on the first try, this would still be considered O(n), with n equal to 100, because it could possibly take up to 100 operations.

The most efficient Big O Notation is O(1), also known as constant time. It means that no matter how many items there are, the operation always takes the same amount of time, which is generally close to instant. If we took the same list of 100 items and accessed an index directly, this would be known as O(1). We would retrieve the value in that index immediately without needing to iterate over the list.

One of the less efficient time complexities is O(n**2). This is the typical cost of a double loop. Our Bubble Sort algorithm that we wrote uses a double for loop and is known as one of the less efficient sorting algorithms in programming; however, it is simple to understand, so it makes for a good introduction into algorithms. We’ll see later today how Bubble Sort compares to another algorithm that is designed to be much more efficient.
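To get a feel for these growth rates, the following sketch counts basic steps rather than measuring time. The three helper functions are made up for this illustration and stand in for a direct index access, a single loop, and a double loop.

```python
# Count basic steps for three common growth rates (illustrative helper names).
def constant_steps(items):
    _ = items[0]            # one direct index access, regardless of size
    return 1

def linear_steps(items):
    steps = 0
    for _ in items:         # one step per item: O(n)
        steps += 1
    return steps

def quadratic_steps(items):
    steps = 0
    for _ in items:         # a loop inside a loop: n * n steps, O(n**2)
        for _ in items:
            steps += 1
    return steps

for n in (10, 100, 1000):
    data = list(range(n))
    print(n, constant_steps(data), linear_steps(data), quadratic_steps(data))
```

Each time the list becomes ten times larger, the constant count stays at 1, the linear count grows tenfold, and the quadratic count grows a hundredfold.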

When you compare a simple search that iterates through each element of a list to an efficient algorithm like Binary Search, you begin to see that they don’t grow at the same rate over time. Take Table 9-1 that illustrates the amount of time to search for a given item

Table 9-1. Big O Notation growth rate comparison¹

Number of Elements               Simple Search    Binary Search
The runtime in Big O Notation    O(n)             O(log n)
10                               10 ms            3 ms
100                              100 ms           7 ms
10,000                           10 sec           14 ms
1,000,000,000                    11 days          32 ms

We can clearly see that efficient algorithms can help to improve our program’s speed. Therefore, it’s important to keep efficiency and time complexity in mind when writing your code. The picture in Figure 9-2 depicts how the number of operations grows with the number of elements for different complexities.

¹ https://guide.freecodecamp.org/computer-science/notation/big-o-notation/

Figure 9-2. Big O Notation complexity over time chart

Not all of Big O Notation is covered here, so be sure to do some further research if you’d like to understand these concepts further. This is simply an introduction into what Big O is and why it is important when writing our programs.

Hash Tables

When we originally covered dictionaries, we went over hashing very briefly. Now that we’ve covered Big O Notation, understanding hash tables and why they’re important is much easier. Dictionary values can be accessed in O(1) time on average because of how they are stored in memory. They use hash tables to store the key-value pairs. Before we cover hash tables though, let’s have a quick refresher on the hash function and how to use it.

    >>> a, c = 'bo', "bob"
    >>> b = a
    >>> print(hash(a), hash(b), hash(c))

From the preceding code, we would get the same values for a and b and a separate value for the hash of c. Hash functions are used to create an integer representation of a given value. In this case the integer for the string “bo” and the variables a and b are the same; however, “bob” and the c variable are completely different because they have a different value. When dictionaries store key-value pairs into memory, they use this concept. A hash table is used to store a hash, a key, and a value. The hash stored is used for when you need to retrieve a given value by the key. Take Table 9-2, for instance. There are three key-value pairs in place, all with different hash values. When you want to access the value for name, you would write:

    >>> person["name"]

What happens is Python hashes the string “name” and uses the hash value to find where the pair is stored rather than comparing against every key. You can think of this like retrieving an item within a list by its index. This is much more efficient, as you can retrieve values based on hashes almost instantly at O(1) time.

Table 9-2. Logical representation of Python hash table

Hash           Key         Value
2839702572     Name        John Smith
8267348712     Age
-2398350273    Language    Python

Dictionaries are helpful data collections for not only keeping information connected but also improving efficiency. Keep this in mind when you’re trying to answer programming questions or making a program faster. Like the information on Big O Notation, this is simply an introduction into hash tables. If you’d like to learn more, be sure to look it up using Google, Quora, etc
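To make the idea more tangible, here’s a toy hash table built on top of a list of “buckets.” This is a teaching sketch under simplifying assumptions and not how Python’s dictionaries are actually implemented; the class and method names are made up. Also note that the integers returned by hash() for strings normally change each time Python is restarted, so the exact numbers you see will differ from run to run.

```python
# A toy hash table with a fixed number of buckets (teaching sketch only).
class ToyHashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # hash() gives an integer; the remainder picks a bucket index
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value      # key already stored: overwrite its value
                return
        bucket.append([key, value])

    def get(self, key):
        # only the one bucket chosen by the hash is searched
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        raise KeyError(key)

person = ToyHashTable()
person.set("name", "John Smith")
person.set("language", "Python")
print(person.get("name"))
```

Because the hash points straight at one small bucket, a lookup never has to walk through every stored pair, which is where the near-instant retrieval comes from.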

Dictionaries vs. Lists

To understand the true power of a hash table and Python dictionaries, let’s compare it against a list. We’ll write a conditional statement to have Python check for a given item within a dictionary and list, and we’ll time how long each one takes. We’re going to separate the code into two cells. The first cell will generate the dictionary and list with 10 million items.

    # creating data collections to test for time complexity
    import time

    d = {}    # generate fake dictionary
    for i in range(10000000):
        d[i] = "value"

    big_list = [x for x in range(10000000)]    # generate fake list

Go ahead and run the cell. Nothing will happen yet. We’ve simply made the variables within this cell so that we don’t have to re-create them, as it takes a couple of seconds depending on your computer. In the following cell, we’re going to keep a timer on how long each data collection takes to find the last element. We’ll use the time module in order to track the start and end time.

     1. # retrieving information and tracking time to see which is faster
     3. start_time = time.time()    # tracking time for dictionary
     5. if 9999999 in d:
     6.     print("Found in dictionary")
     8. end_time = time.time() - start_time
    10. print("Elapsed time for dictionary: {}".format(end_time))
    12. start_time = time.time()    # tracking time for list
    14. if 9999999 in big_list:
    15.     print("Found in list")
    17. end_time = time.time() - start_time
    19. print("Elapsed time for list: {}".format(end_time))

Go ahead and run the cell. On lines 3 and 12, we access the current time as a number of seconds. After checking our conditions, we get the current time again; however, we subtract the start time from it to get the number of seconds the entire execution took. You’ll notice there’s a large difference between the two times. The list will usually take between 1 and 1.5 seconds, depending on your computer, whereas the dictionary is almost instant every time. Now this doesn’t seem like that big of a difference, but what if you needed to search for 1000 items? Using a list now becomes a problem, as a dictionary would continue to do it instantly, but the list would take much longer.

Note The time.time() function returns the number of seconds since the Unix epoch, which is January 1, 1970 at 12:00 AM UTC (Coordinated Universal Time). That is the number you see when you output time.time().
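The dictionary and list comparison above can also be wrapped in a small helper. The following is a minimal sketch, not the book’s code: the helper name time_membership and the smaller collection size are assumptions chosen so the cell runs quickly, and time.perf_counter() is swapped in for time.time() because it is intended for measuring short durations.

```python
import time

def time_membership(collection, target):
    """Return (found, elapsed_seconds) for one `target in collection` check."""
    start = time.perf_counter()
    found = target in collection
    elapsed = time.perf_counter() - start
    return found, elapsed

n = 1_000_000  # smaller than the 10 million used above so it runs quickly
as_dict = {i: "value" for i in range(n)}
as_list = list(range(n))

# look up the last element in each collection
found_d, dict_seconds = time_membership(as_dict, n - 1)
found_l, list_seconds = time_membership(as_list, n - 1)

print("dictionary:", found_d, dict_seconds)
print("list:      ", found_l, list_seconds)
```

The exact timings will vary from run to run and from machine to machine; what matters is how far apart the two numbers are.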

Battle of the Algorithms

One of the most obvious ways to test time complexity is to run two algorithms against each other. This will allow us to really see the power behind an efficient algorithm. We’re going to test Bubble Sort against another sorting algorithm called Insertion Sort. Although Insertion Sort isn’t the most efficient sorting algorithm, we’ll find out that it’s still much more powerful than Bubble Sort. Let’s go ahead and write out the two sorting algorithms within the first cell.

    # testing bubble sort vs. insertion sort
    def bubbleSort(aList):
        for i in range(len(aList)):
            switched = False
            for j in range(len(aList) - 1):
                if aList[j] > aList[j + 1]:
                    aList[j], aList[j + 1] = aList[j + 1], aList[j]
                    switched = True
            if switched == False:
                break
        return aList

    def insertionSort(aList):
        for i in range(1, len(aList)):
            if aList[i] < aList[i - 1]:
                for j in range(i, 0, -1):
                    if aList[j] < aList[j - 1]:
                        # swap the item with its left neighbor
                        aList[j], aList[j - 1] = aList[j - 1], aList[j]
                    else:
                        break
        return aList

Go ahead and run the cell. Now that we’ve defined the two functions we need to call, let’s set up some random data to be sorted and set up a timer like we did in the previous section. Both functions sort the list in place, so each call receives its own copy of nums; otherwise Insertion Sort would be handed a list that Bubble Sort had already sorted.

    # calling bubble sort and insertion sort to test time complexity
    import time
    from random import randint

    nums = [randint(0, 100) for x in range(5000)]

    start_time = time.time()    # tracking time bubble sort
    bubbleSort(nums[:])    # sort a copy so nums stays unsorted
    end_time = time.time() - start_time
    print("Elapsed time for Bubble Sort: {}".format(end_time))

    start_time = time.time()    # tracking time insertion sort
    insertionSort(nums[:])    # sort a copy of the same unsorted data
    end_time = time.time() - start_time
    print("Elapsed time for Insertion Sort: {}".format(end_time))

Go ahead and run the cell. It’s not even a contest. Insertion Sort is a more efficient algorithm than its counterpart. Although both use the concept of a double for loop, Bubble Sort’s steps are much more inefficient because it starts at the front of the list each time. It’s always important to keep time complexity in mind when designing your program and algorithms. If you’re ever unsure what’s best to use, try testing it like we have here
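Timing is one way to compare the two algorithms; counting the work each one does is another. The sketch below is an illustration rather than the book’s code: the function names bubble_sort_steps and insertion_sort_steps, the fixed random seed, and the list size of 500 are all assumptions. Each function sorts a copy of the input and returns how many comparisons it made, so both start from the same unsorted data.

```python
import random

def bubble_sort_steps(items):
    """Bubble sort a copy of `items`; return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    for _ in range(len(data)):
        switched = False
        for j in range(len(data) - 1):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                switched = True
        if not switched:  # a full pass without swaps means the list is sorted
            break
    return data, comparisons

def insertion_sort_steps(items):
    """Insertion sort a copy of `items`; return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        # walk the new item left until its left neighbor is not larger
        for j in range(i, 0, -1):
            comparisons += 1
            if data[j] < data[j - 1]:
                data[j], data[j - 1] = data[j - 1], data[j]
            else:
                break
    return data, comparisons

random.seed(0)  # fixed seed so repeated runs use the same data
nums = [random.randint(0, 100) for _ in range(500)]

bubble_sorted, bubble_count = bubble_sort_steps(nums)
insertion_sorted, insertion_count = insertion_sort_steps(nums)

print("Bubble Sort comparisons:   ", bubble_count)
print("Insertion Sort comparisons:", insertion_count)
```

Both algorithms are O(n²) in the worst case, which is why the comparison counts grow quickly for both as the list gets longer; Insertion Sort simply stops each inner loop as soon as the item is in place.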

THURSDAY EXERCISES

1. Merge Sort: Do some research, and try to find out the “Big O” representation for a Merge Sort algorithm.
2. Binary Search: What is the max number of guesses it would take for a Binary Search to find a number within a list of 10 million numbers?


Although today was more about theory than any other part of this book, it’s one of the most important aspects of programming. Big O Notation helps us to understand the efficiency of our programs and algorithms. It’s always important to understand why we use certain data collections like dictionaries or lists. When efficiency is important, dictionaries can be implemented to improve a program. This is another reason why we use dictionaries for caching
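As a reminder of why dictionaries suit caching, here is a minimal sketch of a dictionary-backed cache. The names cache and fib are illustrative and not taken from an earlier lesson; the point is only that the “have we computed this already?” check is a dictionary lookup, which stays fast no matter how many results are stored.

```python
cache = {}  # maps n -> fib(n) for values that were already computed

def fib(n):
    """Return the nth Fibonacci number, reusing cached results."""
    if n in cache:  # dictionary lookup, roughly constant time
        return cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    cache[n] = result
    return result

print(fib(50))
```

Without the cache, the same recursive function would recompute the smaller values over and over and would take far longer for an input like 50.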

Friday: Interview Prep

If you’re looking for a new career or job as a Python developer, then all these lessons will be for naught if you can’t pass the interview process. For this Friday, we’re going to cover the process of a general software development interview. We’ll cover each stage, what to do before and after the interview, whiteboarding, answering general and technical questions, and how to tailor your resume and profiles. This lesson is meant to be helpful for those who are struggling with the interview process or who have never had a formal software development interview. If you have no interest in this section and wish to continue, use today as a break from this book’s schedule.

Developer Interview Process

The interview process for a developer role can be broken down into several stages. In the following, you’ll find the main stages that many companies in the industry practice. Keep in mind that this is a general interview process and not every company will follow it to a tee. Use this section as a guide on what to expect.

• Stage 1
–– Basic questions about yourself along with past work experience. The first step will usually be a phone call with a third-party recruiter, internal recruiter, HR, or talent acquisition at the company. During this step, the interviewer is trying to gauge whether you are the correct fit for the role. They are looking for you to mention the “buzzwords” along with providing information on why you are a good fit for the position. You want to relate yourself to the position. Be sure to talk about your experience using the languages and technologies they’re looking for. The interviewer is looking for you to meet about half of the requirements to consider you a good match. No one will ever know everything, but it is good to show them what you know and your willingness to learn.

Note Buzzwords are keywords that the position is looking for. For example, a back-end position using Python would expect to hear words like API, JSON, Python, Flask, Django, Jinja, Relational Databases, PostgreSQL, etc.

• Stage 2
–– If you’ve made it past the phone screen, you’ll usually be asked to come in for an in-person interview. This stage is generally where you meet other developers who currently work at the company. Although they’ll ask you interview questions, this stage is mainly for the employees to see if they would like to work with you and to get to know you on a more personal level. Generally, you’ll interview with small groups of employees at a time. You’ll have about two to five of these sessions that will last around 10–15 minutes each. Before hiring an individual, these groups will generally get together to discuss potential candidates for the next stage. During this stage, be sure to properly introduce yourself and shake each person’s hand. Get to know each employee, and try to relate with them on a personal level.

• Stage 3
–– This is the technical round. In this stage, questions will be asked to assess the developer’s skills and abilities. Generally, there will be a whiteboarding question, a couple of technical questions on paper, and a brain teaser. This stage is generally conducted with the hiring manager or the manager of the team that you’ll be working with. When asked a question, make sure you understand it clearly. You are more than welcome to ask as many questions as you need to clearly understand the problem before answering the question. If you do not know the answer to the question, let the interviewer know that you have not worked with that concept or that you do not see how to approach the problem. The interviewer during this stage will know if you have no idea what you’re talking about, so don’t try to make something up. They’ll be more impressed with your honesty and will try to guide you through the problem. During this stage, they care less about whether you’re right or wrong. They’re more interested in how you think and how well you can problem-solve.

• Stage 4
–– At this point, you’re generally sitting with the hiring manager or an HR representative. In this stage, you can ask questions about the company, as well as the job role. If you’ve made it this far, the company has seen value in you as a potential employee. Usually, this is where contract negotiations and salary conversations occur. At the end of the interview, always have questions ready to ask, and lots of them. Having no questions is generally taken as a sign of being unprepared or lazy.

What to Do Before the Interview

In almost everything that you do in life, you can never be too prepared. The same goes for interviewing. The following are tips for what you should do before your interview.

• Research
–– Be sure to research the company you’re interviewing with. Don’t just understand what products they create or services they offer; know what charities they support, the companies they partner with, etc. It shows that you’re involved and care about the company’s well-being. A little goes a long way.

• Be Prepared
–– Put together a folder or portfolio that includes your resume, a pad of paper for taking notes during the interview, examples of work, etc.

• Resume
–– Always print your resume on higher-quality paper.
–– Tailor your resume to the job you’re interviewing for. For example, for back-end roles, mention Python, SQL, database-related technologies, etc.
–– Keep your resume to a single page.
–– Don’t add any fluff.
–– Keep it organized with sections like experience, skills, and education.
–– Think of your resume as a 30-second elevator pitch.
–– Often, it helps to have a designer look over your resume. Some sites will do this for a small fee and help to make your resume look more professional and organized.

• Portfolio Web Site
–– Not all developers have personal web sites, but it certainly looks bad when you don’t. Imagine going to a dentist who has no teeth. View yourself as the product that you’re trying to sell to companies; you should have a web site that shows your skills and allows others to contact you.

• Github
–– Almost every hiring agency and company will look at your Github to see the projects you’ve worked on.
–– It’s best to have complete projects in your portfolio as well. One major project will always stand out better than 10 minor projects.
–– Include your Github account in your resume, portfolio web site, and e-mails.

• LinkedIn
–– Most recruiters and companies are on LinkedIn for one reason, and that’s to look for potential candidates for a job posting.
–– Make sure your profile is up to date with all relevant information and projects that you’ve worked on.
–– Your profile picture should be professional. You don’t need to be in a suit and tie, but it’s best not to have a picture of you on a beach.
–– Look at this web site as your professional networking service.
–– Post often with information from the field you want to work in. The more you post, the more apt a recruiter is to recognize you.

• Social Media
–– Make it private or keep it clean. You better believe companies will look at your posts for a way to understand who you are, and if they don’t like what they see, you won’t be getting a call back.

• Apply Directly
–– It always looks more professional to send in an application directly to the company. Often, you’ll find a job you like on Indeed or ZipRecruiter; however, companies get flooded with applications every day on these sites, and they generally have algorithms to eliminate most candidates. Sending a direct e-mail shows that you put time and effort into directly contacting the company.

General Questions

The following is a list of general nontechnical questions, each followed by an example of a good answer. These questions were selected because they are asked often and are usually answered improperly.

• What salary are you looking for?
–– “I don’t have an exact number right now. I’d like to do some more research on what other companies are offering for a similar position. What do you pay your employees on average for this position?”
–– Never state a number when they ask; doing so gives them leverage during any negotiation process.
–– Counter their question with another question.
–– If they continue to ask you for a number, simply state the same response.

• Where do you see yourself in five years?
–– “I’m more so focused on my skills over the next five years. I know that focusing on continuing my education and improvement of myself will lead me to where I need to be.”
–– Focusing on improving your skills shows passion.

• Why did you want to be a software developer?
–– “I’ve always been intrigued by being able to build something out of nothing, and I’ve always enjoyed a challenge. When you’re able to solve problems and build applications, it’s a wonderful feeling.”
–– Show the passion that you have as a developer; it will always come off as a strength.
–– Never mention it’s about money, even if it is.

• Why are you changing careers?
–– “It felt like I wasn’t being challenged enough in my previous career, and I’ve always been interested in programming and the thrill that comes with building applications that improve people’s lives.”
–– Like the previous question, show the passion and drive that you have for this career.
–– Explaining that you like to be challenged shows you’re not lazy.
–– Never mention it’s about money, even if it is.

• Why do you want to work here?
–– “The applications that you build here help so many users around the world, and I’d love to be a part of that.”
–– Talk about the applications or charities that the company works with. It shows that you have passion, work well in teams, and that you’re driven.
–– Mentioning the culture of the company would be a great answer as well.
–– Do not mention salary or benefits, or even worse, have no answer.

• Tell me about a tough software problem and how you solved it.
–– “I was working on a project where I was assigned to implement the Steam API into the application. Unfortunately, the API wouldn’t connect properly. Using the debugger, I set break points at the import and function call locations. After realizing that they weren’t being hit at all, I figured it must be an issue with connecting. Having tried several import variations and read through the documentation, I decided to set up the application to close when the function was hit. When I ran the program the next time, it closed instantly. Realizing that the function was being called but the application wasn’t running properly, I figured it had to be an import issue. It wasn’t until I tested the API in a more up-to-date application that I realized the problem was due to the code being written in version 2.2, when the API required version 3.6. In order to connect the API, I had to manually import the library through a mapper function that could translate the code between versions. After realizing that the mapper worked, I was able to implement the libraries that the Steam API included in its SDK.”
–– Go as in depth as you can with the problem. They want to know every little detail that caused the issue, how you fixed the problem, and all the ideas you had in trying to solve the problem. Although the preceding answer may not make much sense to you right now, it shows the problem, what I did to try to find the issue, and how I came up with a solution once I found the problem.

Whiteboarding and Technical Questions

This section is a list of tips that you should consider using during the third stage of the interview process for both whiteboarding and technical questions.

• Take Your Time
–– There’s absolutely no rush to solve a problem. Think through a proper solution first before answering the question. Often, you’ll think of two or three different solutions given time.

• Speak Out Loud
–– Always talk through your thought process. It makes the interviewer feel more comfortable, so that you’re not both sitting in a quiet room while you think.
–– It shows the interviewer your ability to problem-solve.
–– Even if you don’t give the correct answer, they can at least understand where you went wrong and offer some guidance.

• Steps > Syntax
–– When whiteboarding, you’ll need to write out a function or some lines of code on the board in front of the interviewer. The most important thing to remember is that your thought process is more important than your actual code.
–– You can have syntactical bugs on a whiteboard and still pass the interview; however, having an incorrect algorithm or set of steps will cause you to fail.

• Ask Questions
–– If you’re unsure, ask questions. It’s perfectly fine to ask questions when trying to solve a problem.
–– Keep in mind that the questions you ask matter, though. There’s a big difference between asking what a sort method does and asking what type of sort method the interviewer would like you to use.

• Algorithmic Complexity
–– Always keep in mind the complexity of an algorithm. After you write your code, you’ll generally be asked if there is a way to improve its performance even further.
–– Know the Big O Notation category of the algorithm you just wrote.
–– Think about what data types or collections would work best for your scenario.

• Be Honest
–– If you don’t know an answer, absolutely do not try to talk your way through it. The interviewer during this stage is a professional developer and can pick apart anything that doesn’t make sense.
–– Being honest and saying you’re not sure but are willing to learn the material will always prove to be a better method of answering questions you don’t know how to solve.
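To make the Algorithmic Complexity tip concrete, here is a sketch of the kind of improvement an interviewer might ask about. The task (checking a list for duplicates) and both function names are assumptions chosen for illustration. The first version compares every pair of items, which is O(n²); the second remembers what it has seen in a set, which uses hash-based lookups like a dictionary and brings the work down to roughly O(n).

```python
def has_duplicates_quadratic(items):
    """O(n^2): compare every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    """Roughly O(n): remember seen items in a set (hash-based lookups)."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False

sample = [4, 8, 15, 16, 23, 42, 8]
print(has_duplicates_quadratic(sample))
print(has_duplicates_linear(sample))
```

Being able to explain why the second version is faster, and that it trades extra memory for speed, is usually worth more on a whiteboard than the code itself.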

End of Interview Questions

You never want to be empty handed at the end of an interview when they ask if you have any questions. It’s usually good practice to take notes during an interview and write down questions as you think of them. In the following, you’ll find a list of questions that you should consider asking.

• How is the commute?
• Is parking free?
• Do you hold social events?
• If I wanted to further my career skills, do you guys offer any services or tuition reimbursement?
• What kind of benefits do you offer?
• What is the company culture like?
• How many people will be working on the team with me?
• Will there be mentoring involved?
• Can you tell me more about the day-to-day responsibilities of this role?
• What do you like best about working for this company?
• What is the typical career path within this company for someone in this role?
• What are the next steps in the interview process?
• What might I expect in a typical day?
• What charities does this company support?
• Are there any company activities, like sports teams?

What to Do After the Interview

Even if you pass the first three stages, you can still fail miserably if you don’t execute the proper steps following the interview. In the following, you’ll find examples of what you should do once the interview process is complete.

• Follow Up
–– Always, always, always send an e-mail to the interviewer immediately, thanking them for their time. It shows respect and is a courteous gesture.

• Critique Yourself
–– Understand your own mistakes. Don’t take it personally; the only way you can get better is by understanding and self-reflecting.

• Continue Building
–– Always be working on projects and trying to improve your portfolio.
–– Stay up to date with the latest libraries, languages, and technologies.
–– Update your resume and portfolio often.

• Adventure Out
–– Go out to local networking events in your area. This is where you’ll meet most of your connections. It’s always easier to land a job when you know someone who works in the company.
–– Events like code-alongs or hackathons are a great way to meet other developers looking to work together.

• Rejection
–– It happens; you won’t always get the job. If it does occur, be sure to ask the interviewer in a courteous manner why you didn’t get the job. Don’t take it personally; instead use this information to become a better developer and improve.

Today was all about understanding the interview process and how you can improve your interviewing skills. Even the greatest programmers can be terrible interviewers. It takes a lot of hard work and focus to land the proper job, and even then, it may not work out. The best advice is to just continue to improve your skills and network with other software developers

Weekly Summary

This week was the second portion of the more advanced Python concepts. Many of the lessons taught this week were important not only for interviewing but also for improving the performance of your projects. Iterators and generators are types of objects that can be used to create better looping structures and algorithms. Being able to use decorators will help to improve function capabilities, and decorators are widely used within frameworks like Flask or Django. Modules allow us to use other developers’ code by importing functions or entire files into our program. Being able to write our own modules allows us to reduce the amount of code in each file. You generally want to stay as organized as possible because it makes the project easier to read, maintain, and fix. If there’s one topic you need to understand from this week, however, it would be Big O Notation. Understanding how Big O works can help in job interviews and in knowing how to improve the speed of an application. There are more advanced topics to cover on Python and programming in general, but these last two weeks will give you enough to start building your own projects and even move on to learning about frameworks and larger-scale applications using databases.


Challenge Question Solution We were able to review the exact answer to this question during the lesson from Thursday. It was easy to see that dictionaries are clearly the more efficient way to store and retrieve data. It’s always important to keep in mind the proper data structures to use when working with large sets of data. You can be sure that similar questions will be asked in an interview process

Weekly Challenges

To test out your skills, try these challenges.

1. Understanding the Market: Go on to a job application web site like Indeed or Monster, and look up potential jobs that you’re interested in. Make notes of the qualifications and technologies they’re looking for. After looking at several job descriptions, what are the top three technologies? These should be your focus going forward.
2. Shopping Cart Module: Take the code from our Shopping Cart program that we wrote a few weeks back, and put it into a module. In Jupyter Notebook, run the module, and get the program to work properly.
3. Enhanced Shopping Cart: Add a new feature into the program that allows the user to save the cart. Upon running the program, the saved cart should load. The method should be written within the module. Hint: Use a CSV or text file.
4. Code Wars: Make an account on www.codewars.com and try to solve some problems. Code Wars has been used for interview practice problems, improving your algorithm and problem-solving skills, and much more. It will help to strengthen the skills taught in this book. Try to solve a problem a day, and you’ll notice your Python programming skills improve.


CHAPTER 10

Introduction to Data Analysis

Up to this point, we’ve covered enough Python basics and programming concepts to move on toward bigger and better things. This week will encompass a full introduction into the data analysis libraries that Python has to offer. We won’t go in depth like other books that focus on this subject; instead we’ll cover enough to get you well on your way to analyzing and parsing information. We’ll learn about the Pandas library and how to work with tabular data structures, web scraping with BeautifulSoup and understanding how to parse data, as well as data visualization libraries like matplotlib. At the end of the week, we’ll use all these libraries together to create a small project that scrapes and analyzes web sites.

Overview

• Working with Anaconda environments and sending requests
• Learning how to analyze tabular data structures with Pandas
• Understanding how to present data using matplotlib
• Using the BeautifulSoup library to scrape the Web for data
• Creating a web site analysis tool

CHALLENGE QUESTION Imagine you’re a data analyst and you’ve just been handed a set of data that shows the number of accidents for all drivers, their ages, and the size of their engines. You need to figure out a way to display this information so that it tells a story. Normally you would create a graph with x, y, z coordinates; however, that can become complicated, and you don’t have time for that. How would you render the information so that it’s still considered 3-dimensional, but you can only use the x and y axis?

Monday: Virtual Environments and Requests Module

Today we’ll be learning all about virtual environments, why we need them, and how to use them. They’re necessary for what we need to do this week, which is downloading and importing a few libraries to work with. We’ll also get into the requests module and cover APIs briefly. For today’s lesson, we won’t be starting out in Jupyter Notebook; instead open the terminal and cd into the “python_bootcamp” folder if you haven’t already. If you have the terminal running Jupyter Notebook, be sure to stop it, as we need to write some commands in the terminal.

What Are Virtual Environments?

Python virtual environments are essentially a tool that allows you to keep project dependencies in a separate space from other projects. Most projects in Python need to use modules that are not included by default with Python. Now, you could simply download the modules (or libraries) into your Python folder to use; however, that could cause some issues down the road. Let’s say you’re working on two separate projects, where the first one uses Python version 2.7 and the second project uses Python version 3.5. If you try to use the same syntax for both, you’ll run into several issues. Instead, you would create two separate virtual environments, one for each project. This way both projects can run properly using the correct dependencies because of the personalized virtual environment.


Note When creating a virtual environment, a folder for the environment (often named “venv”) will appear. This is where all the libraries that you download are saved. Simply put, a virtual environment is not much more than a folder that stores other files.

As an analogy to understand virtual environments, first picture our own planet. Now think of it as an environment filled with grass, sun, clouds, air, etc. In the case of programming, Python would be like the planet, and the grass, sun, clouds, and air would be like libraries that you need to include in the environment. As Python does not come included with them, we would create a virtual environment to store these libraries so that we may import them into our project when needed. If you think of Mars, that would be another project, with a separate virtual environment specifically made for that program.

Virtual environments can often be a tough concept to grasp for anyone seeing them for the first time, so here’s another analogy. Imagine you’ve planned two vacations, one to the beach and the other to go skiing. Rather than using the same suitcase filled with mixed clothes, you’ve decided to pack two separate suitcases. The one for the beach will include a bathing suit, sunglasses, and flip-flops. The other suitcase will include a jacket, skis, and boots. In the following, you can find the relationships within this analogy.

• Vacations ➤ Projects
• Suitcases ➤ Virtual Environments
• Clothes and Accessories ➤ Project Dependencies/Files

Note Remember from the first chapter, when working in terminal, you’ll see the $ next to the commands that we enter. For the next few sections, we’ll be working inside of terminal


What Is Pip?

Pip is the standard package manager for Python. Anytime you need to download, uninstall, or manage a library or module to use within your project, you use pip. It has been included in all installations of Python since v3.4. To check your version of pip, write the following in terminal.

    $ pip --version

Feel free to visit the Python Package Index (PyPI) to view all the possible libraries that you’re able to download. You can use any of them in your future projects. For today, we’ll learn how to install and use the requests module, but first, let’s create and activate a virtual environment.

Creating a Virtual Environment

One of the big reasons Anaconda is such a wonderful tool is its ability to organize virtual environments for us. We’re going to use it to create our first virtual environment. While in terminal, type in the following command.

    $ conda create --name data_analysis python=3.7

Go ahead and run the command. It’ll then ask you if you’d like to proceed by typing in “y” or “n”; simply type “y” for yes and hit enter. A folder will be created within the Anaconda directory in our program files. The folder will be given the name “data_analysis.” We’ve just created our own virtual environment using Python version 3.7. In order to use it, we must activate it. If you want to use Python’s built-in virtual environment system instead, look up the “venv” module (the third-party “virtualenv” package is a similar tool). We will use Conda’s environments for ease of use throughout this chapter.
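For readers curious about Python’s built-in alternative to Conda environments, the following is a small sketch using the standard library’s venv module. It is not part of this chapter’s workflow: the helper name create_environment is hypothetical, and the environment is created inside a temporary folder that is deleted afterward, purely to show that an environment really is just a folder of files.

```python
import os
import tempfile
import venv

def create_environment(parent_dir, name="data_analysis"):
    """Create a virtual environment folder under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps this sketch fast; use with_pip=True to bootstrap pip
    venv.create(env_dir, with_pip=False)
    return env_dir

# build the environment in a scratch folder that is removed when we are done
with tempfile.TemporaryDirectory() as scratch:
    env_dir = create_environment(scratch)
    contents = sorted(os.listdir(env_dir))
    has_config = os.path.exists(os.path.join(env_dir, "pyvenv.cfg"))

print(contents)
print("pyvenv.cfg present:", has_config)
```

From a terminal, the usual way to do the same thing is to run the venv module with “python -m venv” followed by a folder name.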

Note You can create a conda environment from anywhere; you do not need to be cd’ed into a specific folder


Activating the Virtual Environment

The second step in using a virtual environment is activating it. Activating an environment allows the computer to execute our scripts from a separate executable. By default, we use the Python executable file stored in our program’s directory. We can see the path of the executable by entering the following commands into the terminal. We need to start the Python shell first.

    $ python

Now we can view the path by typing in the following lines.

    >>> import os
    >>> import sys
    >>> os.path.dirname(sys.executable)

You’ll notice that the path is the default folder where Python was originally installed. Go ahead and exit the Python shell once you’re done. We’ll come back to these same commands at the end of this section to see how the path has changed once the environment is activated. Once you create the environment, you don’t need to create it again; you can simply activate it anytime you need to use it. Before you’re able to download libraries into the environment, you must first activate it. Depending on your operating system, write the following command in terminal.

For Windows:

    $ activate data_analysis

After activating the environment, you’ll see the name appear within parentheses on the left side of the terminal. It will be shown like the following.

    (data_analysis) C:\Users

For Mac/Linux:

    $ source activate data_analysis

Like Windows, after activating the environment, you’ll see the name to the left of your directory.

    (data_analysis) ~/Desktop

On recent versions of Conda, the command “conda activate data_analysis” works on all of these operating systems. If you can see the name on the side, you’ve successfully activated the environment. Before we move on, let’s see where our executable is now by running the same commands in the Python shell from the beginning of this section to view the path of the executable.

    >>> import os
    >>> import sys
    >>> os.path.dirname(sys.executable)

After running those same lines, you’ll notice that a different path has been output. This is the executable of our Conda environment that will be running our scripts. Now we can begin to install any libraries or packages we may need to work with.
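The check on the executable can also be done from a script or notebook cell. The sketch below is one possible way to do it: the function name describe_interpreter is made up for this example, CONDA_DEFAULT_ENV is an environment variable that Conda sets when an environment is active, and the in_venv test only detects environments made with Python’s built-in venv tooling, not Conda ones.

```python
import os
import sys

def describe_interpreter():
    """Collect a few facts about the Python executable running this code."""
    return {
        # folder that holds the running Python executable
        "executable_dir": os.path.dirname(sys.executable or ""),
        # root folder of the current Python installation or environment
        "prefix": sys.prefix,
        # True when running inside a venv-style environment
        "in_venv": sys.prefix != sys.base_prefix,
        # name of the active Conda environment, or None if not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }

info = describe_interpreter()
for key, value in info.items():
    print(key, "=", value)
```

Running this before and after activating the environment should show the executable folder and the Conda environment name change.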

Installing Packages

To install packages into the virtual environment, we’ll use pip. The syntax is always the same to install any package. It’s the keywords pip install, followed by the package name. In our case, we’ll be working with the requests package today. Let’s write the following command.

    $ pip install requests

Go ahead and run the command. We’ve just installed the requests module into our environment to work with. To be sure that it installed properly, write the following command.

    $ conda list

This command lists out all the packages that are installed within this environment. You’ll be able to see the requests package that we just downloaded as well as the other packages that were downloaded initially when we created the environment.
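Besides listing packages in the terminal, you can ask Python itself whether a package is available. This is a minimal sketch with two hypothetical helper names, installed_version and can_import; it assumes Python 3.8 or newer, where importlib.metadata is part of the standard library (the Python 3.7 environment created earlier would need a different approach).

```python
from importlib import metadata, util

def installed_version(package_name):
    """Return the installed version string for a package, or None if missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

def can_import(module_name):
    """Return True if Python can locate the module for importing."""
    return util.find_spec(module_name) is not None

# the values printed depend on what is installed in the active environment
print("requests version:", installed_version("requests"))
print("can import requests:", can_import("requests"))
```

If the first line prints None, the package is not installed in the environment that is running the code, which usually means the environment was not activated before installing.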


APIs and the Requests Module

The requests module allows us to make HTTP requests using Python. It is a third-party package (not part of Python's standard library) and is the de facto standard for making API calls and requesting information from outside resources.

Note: If you're unfamiliar with HTTP requests, I suggest checking out the w3schools¹ resource for more information, as this book is not designed to cover networking.

An application programming interface (API) is a set of functions and procedures that allow applications to access the features or data of an operating system, application, or other service. Put more simply, APIs allow us to interact with web pages and software designed by other developers. Imagine you need some data on housing prices. Rather than collecting all that information yourself, you could use the resources that major companies like Zillow and Trulia have put together. In order to access that information, you need to call their API, which will return the data that you need. APIs make a developer's life easier because we can use data or tools created by other companies within our projects.

Using the Requests Module Now that we’ve created and activated our environment and installed the package that we’ll be working with for the rest of the day, we can open Jupyter Notebook

Note: If you do not have the environment activated or the requests module installed, you will receive errors. Be sure to activate the environment, and check that the requests module is installed.

To follow along with the content for the rest of the lesson, open up Jupyter Notebook from our “python_bootcamp” folder in the terminal. Once it's open, create a new file, and rename it to “Week_10.” Next, make the first cell a markdown cell with a header that says “Virtual Environments and Requests Module.” We'll begin working underneath that cell.

¹ www.w3schools.com/tags/ref_httpmethods.asp


Sending a Request

For this lesson, we'll be requesting information from an API created by Github. Generally, APIs require a key in order to use their service; however, we'll be using one that doesn't require an API key. To begin, we must send a request to a specific URL, which will send a response back to us. That response will include data that we'll be able to parse. Write the following:

    # sending a request and logging the response code
    import requests

    r = requests.get("https://api.github.com/users/Connor-SM")

    print(r)
    print(type(r))

Go ahead and run the cell. In order to use requests, you must import it, which is what the import statement at the top of the cell does. Next, we use the get() function of the requests module to request information from the URL that we pass in. The data we expect to get back is the profile information for my Github account. Feel free to replace “Connor-SM” in the URL with your own profile username.

The first print statement will output a response code. You should get back “<Response [200]>”; if you don't, be sure to check your Internet connection. This output lets us know that we were successful in requesting information from the Github URL. For a list of response codes and what they mean, be sure to visit the w3schools² resource.

The second print statement will output the type of our variable, which is a Response object. All Response objects come with default methods and attributes that we can access. This will allow us to work with the data that we received.

Accessing the Response Content

In order to access the data that we get back in the response, we need to access the content attribute of our response object:

    # accessing the content that we requested from the URL
    data = r.content
    print(data)

² www.w3schools.com/tags/ref_httpmessages.asp


Go ahead and run the cell. We'll get a byte string output with lots of brackets and information in a form that's difficult to read. Responses from APIs are generally sent as text, because plain strings are easy to transmit between programs written in different languages. The actual response that we get back is formatted as JSON. JavaScript Object Notation (JSON) is a text format whose objects map closely onto Python dictionaries, and it is one of the most common formats for sending data in API responses. The next step is to convert the data from a JSON-formatted string into a dictionary that we can parse.
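To make the conversion concrete without calling the API, here is a small sketch that performs the same step by hand with the standard library's json module. The byte string below is a hypothetical, shortened stand-in for a response body, not real output from Github.

```python
import json

# Hypothetical response body: a byte string containing JSON text.
raw = b'{"login": "example-user", "public_repos": 12, "site_admin": false}'

# Decode the bytes into a regular string, then parse the JSON text.
text = raw.decode("utf-8")
profile = json.loads(text)

print(type(raw).__name__)       # bytes
print(type(profile).__name__)   # dict
print(profile["public_repos"])  # 12
print(profile["site_admin"])    # False (JSON false becomes Python False)
```

Notice how JSON's lowercase false turns into Python's False once the text has been parsed into a dictionary.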

Converting the Response

Luckily for us, the response object comes with a built-in JSON conversion method called json(). After we convert the response to a dictionary, let's output all the key-value pairs:

    # converting data from JSON into a Python dictionary and outputting all key-value pairs
    data = r.json()  # converting the data from a string to a dictionary

    for k, v in data.items():
        print("Key: {} \t Value: {}".format(k, v))

    print(data["name"])  # accessing data directly

Go ahead and run the cell. All the information is now easy to read and access, as seen through the for loop and the simple print statement.

Passing Parameters

Most API calls that you perform will require extra information like parameters or headers. This information is taken in by the API and used to perform a specific task. Let's perform a call this time while passing parameters in the URL to search for Python-specific repositories on Github:

    # outputting specific key-value pairs from data
    r = requests.get("https://api.github.com/search/repositories?q=language:python")

    data = r.json()
    print(data["total_count"])  # output the total number of repositories that use python


Go ahead and run the cell. There are a couple of different ways that we can send parameters through the request. In this case, we've written them directly into the URL string itself. You may also define them within the get call like the following:

    >>> requests.get("https://api.github.com/search/repositories",
    ...              params={"q": "language:python"})

When sending parameters through the URL, you separate the URL and the parameters with a question mark. To the right of the question mark is a set of key-value pairs that represent the parameters being passed. For our example, the parameter being passed has a key of “q” and a value of “language:python”. The API on Github will take this information and give us back the data on repositories that use Python, because that's what we asked for in our parameters. Not all APIs require parameters, however, like our first call earlier in this lesson. To figure out what is required when calling an API, always read the documentation. Good documentation for APIs is everything and can make your life as a developer much easier.
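To see how a dictionary of parameters turns into the part of the URL after the question mark, the sketch below builds the query string by hand with the standard library. It does not send a request. The per_page key is included only to show a second key-value pair and is an assumption for illustration.

```python
from urllib.parse import urlencode

base_url = "https://api.github.com/search/repositories"

# One parameter: key "q" with the value "language:python".
single = urlencode({"q": "language:python"})
print(base_url + "?" + single)

# Several parameters are joined with "&" (per_page is illustrative only).
several = urlencode({"q": "language:python", "per_page": 5})
print(base_url + "?" + several)
```

The colon is percent-encoded as %3A in the output, which is how special characters are written safely inside a URL.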

Note: To stop running the virtual environment, simply write “deactivate” into the terminal. You will be asked to activate the environment before each lesson this week.

MONDAY EXERCISES

1. Test Environment: Create a new virtual environment called “test.” When creating it, install Python version 2.7 instead of the current version. After it's completed, make sure it installed the proper version of Python by checking the list.

2. JavaScript Repositories: Using the requests module and the Github API link from our last lesson, figure out how many repositories on Github use JavaScript.


Today was an important introduction to data analysis. Not only did we cover how to use virtual environments and why, but we also went over the requests module with a brief introduction to APIs. When using any library for the rest of the week, we'll need to activate our data_analysis virtual environment. At the end of the week, we'll cover web scraping, which requires us to use the requests module.

Tuesday: Pandas

When you need to work with data, Pandas is the ultimate tool. It's essentially Excel on steroids. If you're familiar with SQL, this will come easier to you, as many Pandas operations (filtering, grouping, joining) mirror SQL operations written in Python. By the end of the day, you'll be able to analyze and work with tabular data more efficiently than with traditional methods. As in yesterday's lesson, we need to install the Pandas library into our virtual environment. To follow along with today's lesson, cd into the “python_bootcamp” folder, and activate the environment. We'll begin today within the terminal.

Note If you can’t remember how to activate the environment, go back to yesterday’s lesson

What Is Pandas?

Pandas is a flexible data analysis library built on top of NumPy, with its performance-critical parts written in C and Cython, which makes it excellent for working with tabular data. It is currently the de facto standard for Python-based data analysis, and fluency in Pandas will do wonders for your productivity and frankly your resume. It is one of the fastest ways of getting from zero to answer. Because its core routines are compiled, calculations run much faster than equivalent pure Python loops. The Pandas module is a high-performance, high-level data analysis library. It allows us to work with large sets of data in tabular structures called DataFrames.

Note: NumPy is a fundamental package for scientific computing in Python. Its core is written in C; it uses multidimensional arrays and can perform calculations at high speed.


The Pandas library is useful in so many ways that you can do any of the following and more:

• Calculate statistics and answer questions about the data, like the average, median, max, and min of each column
• Find correlations between columns
• Track the distribution of one or more columns
• Visualize the data with the help of matplotlib, using bar plots, histograms, etc.
• Clean and filter data, whether it's missing or incomplete, just by applying a user-defined function (UDF) or built-in function
• Transform tabular data into Python objects to work with
• Export the data into a CSV, other file, or database
• Feature engineer new columns that can be applied to your analysis

No matter what you need to do with data, Pandas is your end-all-be-all analysis library

Key Terms

The following are key terms we'll be using throughout this section. Be sure to look over them and reference them when necessary.

• Series ➤ One-dimensional labeled array capable of holding data of any type
• DataFrame ➤ Spreadsheet
• Axis ➤ Column or row; axis = 0 by row, axis = 1 by column
• Record ➤ A single row
• dtype ➤ Data type for a DataFrame or Series object
• Time Series ➤ Series object that uses time intervals, like tracking weather by the hour


Installing Pandas

To install Pandas, make sure your virtual environment is activated first, then write the following command into the terminal:

    $ pip install pandas

After running the command, it should install a few packages that Pandas requires. If you'd like to check that you downloaded the proper library, run the conda list command again.

Importing Pandas

To follow along with the rest of this lesson, let's open and continue from our previous notebook file “Week_10” and simply add a markdown cell at the bottom that says “Pandas.” Importing Pandas is simple; however, there is an industry standard for how you import the library:

    # importing the pandas library
    import pandas as pd  # industry standard name of pd when importing

Go ahead and run the cell. We import Pandas as pd because it's shorter and easier to reference.

Creating a DataFrame

The central object of study in Pandas is the DataFrame, which is a tabular data structure with rows and columns like an Excel spreadsheet. You can create a DataFrame from a Python dictionary or a file that has tabular data, like a CSV file. Let's create our own from a dictionary:

    # using the from_dict method to convert a dictionary into a Pandas DataFrame
    import random

    random.seed(3)  # generate same random numbers every time, number used doesn't matter

    names = ["Jess", "Jordan", "Sandy", "Ted", "Barney", "Tyler", "Rebecca"]
    ages = [random.randint(18, 35) for x in range(len(names))]

    people = {"names": names, "ages": ages}

    df = pd.DataFrame.from_dict(people)
    print(df)

Go ahead and run the cell. We import the random module so that we can create random ages for our people in the list comprehension that builds ages. Calling random.seed(3) beforehand will give us both the same random numbers to work with. You could pass any number as the argument into seed; however, if you use a number other than 3, you'll get a different output than this book.

Note: Random numbers aren't truly random; they follow a specific algorithm to return a number.

After we generate a list of names and random ages for each person, we create a dictionary called “people.” The magic truly happens in the call to pd.DataFrame.from_dict, where we use Pandas to create the DataFrame that we'll be working with. When it's created, it uses the keys as the column names, and the values match up with the corresponding index, such that names[0] and ages[0] will be a single record. You should output a table that looks like Table 10-1.


Table 10-1. DataFrame created from fake data

Columns: ages, names

Records, in index order (names column): Jess, Jordan, Sandy, Ted, Barney, Tyler, Rebecca
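As a side note on construction, the plain DataFrame constructor accepts the same kind of dictionary as from_dict. The sketch below uses small, hand-written values (hypothetical names and ages, not the book's random data) so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical values chosen for illustration only.
people = {"names": ["Ana", "Ben", "Cy"], "ages": [31, 19, 24]}

df_a = pd.DataFrame.from_dict(people)  # the method used in this lesson
df_b = pd.DataFrame(people)            # the plain constructor

print(df_a.equals(df_b))     # both calls build the same table
print(list(df_b.columns))    # dictionary keys become the column names
print(df_b.dtypes)           # each column has its own data type
```

Either form works for a dictionary of equal-length lists; from_dict mainly becomes useful when you need its extra options.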

Accessing Data There are a few different ways that we can access the data within a DataFrame. You have the option to choose by the column or by the record. Let’s look at how to do both

Indexing by Column

Accessing data by a column is the same as accessing data from a dictionary with the key. Within the first set of brackets, you put the name of the column that you would like to access. If you'd like to access a specific record within that column, you use a second set of brackets with the index:

    # directly selecting a column in Pandas
    print(df["ages"])
    print(df["ages"][3])  # select the value of "ages" in the fourth row (0-index based)

    # print(df[4]) doesn't work, 4 is not a column name

Go ahead and run the cell. The first print statement outputs the entire ages column of data. The second statement allows us to access the value at a specific cell. Be careful though: putting the index number in the first set of brackets will raise an error, as the first set is only meant for column names and “4” is not a column.


Indexing by Record

When you need to access an entire record, you must use loc. This allows us to specify the record location via the index. Let's access the entire first record, then the name within that record:

    # directly selecting a record in Pandas using .loc
    print(df.loc[0])
    print(df.loc[0]["names"])  # selecting the value at record 0 in the "names" column

Go ahead and run the cell. We can see that we're able to output the entire record. When using loc, you must specify the record index location first, then the column name.
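One detail worth seeing in code is that loc looks records up by index label, not by position. The two are the same on a fresh DataFrame but differ once it has been sorted. The sketch below uses hypothetical data and also shows iloc, the position-based counterpart of loc.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"names": ["Ana", "Ben", "Cy"], "ages": [31, 19, 24]})

# Row label and column name can be given in a single loc call.
print(df.loc[0, "names"])        # Ana

# After sorting, the rows keep their original index labels (1, 2, 0).
by_age = df.sort_values("ages")

print(by_age.loc[0, "names"])    # Ana: the record whose label is 0
print(by_age.iloc[0]["names"])   # Ben: the record in the first position
```

This matters later in the lesson, because our df gets sorted by age while its index labels stay the same.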

Slicing a DataFrame

When you want to access a specific number of records, you must slice the DataFrame. Slicing in Pandas works the same way as slicing a Python list, using start, stop, and step within a set of brackets. Let's access the records from index 2 up to 5:

    # slicing a DataFrame to grab specific records
    print(df[2:5])

Go ahead and run the cell. This will output the records at index 2, 3, and 4. Again, be careful when slicing, as leaving off the colon would result in trying to access a column name.
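The following sketch, on hypothetical data, shows the start, stop, and step parts of a slice and compares bracket slicing with iloc and loc. The one surprise is that loc slices by label and includes the stop label, unlike the other two.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "names": ["Ana", "Ben", "Cy", "Di", "Eli"],
    "ages": [30, 19, 24, 21, 27],
})

middle = df[1:4]            # positions 1, 2, 3 (stop is excluded)
every_other = df[::2]       # step of 2: positions 0, 2, 4
same_middle = df.iloc[1:4]  # explicit position-based slicing
by_label = df.loc[1:4]      # label-based slicing includes label 4

print(list(middle["names"]))       # ['Ben', 'Cy', 'Di']
print(list(every_other["names"]))  # ['Ana', 'Cy', 'Eli']
print(list(by_label["names"]))     # ['Ben', 'Cy', 'Di', 'Eli']
```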

Built-in Methods These are methods that are frequently used to make your life easier when using Pandas. It is possible to spend a whole week simply exploring the built-in functions supported by DataFrames in Pandas. However, we will simply highlight a few that will be useful, to give you an idea of what’s possible out of the box with Pandas


head()

When you work with large sets of data, you'll often want to view a few records to get an idea of what you're looking at. To see the top records in the DataFrame, along with the column names, you use the head() method:

    # accessing the top 5 records using .head()
    df.head(5)

Go ahead and run the cell. This will output the top five records. You can pass any number into the method, and it will show that many records from the top.

tail()

To view a given number of records from the bottom, you would use the tail() method:

    # accessing the bottom 3 records using .tail()
    df.tail(3)

Go ahead and run the cell. This will output the bottom three records for us to view.

keys()

Sometimes you'll need the column names. Whether you're making a modular script or analyzing the data you're working with, using the keys() method will help:

    # accessing the column headers (keys) using the .keys() method
    headers = df.keys()
    print(headers)

Go ahead and run the cell. This will output an Index object containing the header names in our DataFrame.


.shape

The shape of a DataFrame describes the number of records by the number of columns. It's always important to check the shape to ensure you're working with the proper amount of data:

    # checking the shape, which is the number of records and columns
    print(df.shape)

Go ahead and run the cell. We'll get a (7, 2) tuple returned, representing records and columns.

describe()

The describe method will give you a base analysis for all numerical data. You'll be able to view the min, max, 25%, 50%, mean, etc., of all columns just by calling this method on the DataFrame. This information is helpful to start your analysis but generally won't answer the questions you're asking. Instead, we can use this method as a guideline for where to start:

    # checking the general statistics of the DataFrame using .describe(), only works on numerical columns
    df.describe()

Go ahead and run the cell. Remember that it only gives back information on numerical column types, which is why we only see an output for the ages column.

sort_values()

When you need to sort a DataFrame based on column information, you use this method. You can pass in one or multiple columns to sort by. When passing multiple, you must pass them in as a list of column names, in which the first name will take precedence:

    # sort based on a given column, but keep the DataFrame intact using sort_values()
    df = df.sort_values("ages")
    df.head(5)


Go ahead and run the cell. In this cell, we've reassigned our df variable to the newly sorted DataFrame. This way we can view all the people sorted by age. You may also pass in the argument ascending=False to sort in descending order.
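Here is a short sketch of both options on hypothetical data: sorting in descending order, and sorting by a list of columns where the first column takes precedence and the second breaks ties.

```python
import pandas as pd

# Hypothetical data; two people share the same age on purpose.
df = pd.DataFrame({
    "names": ["Ana", "Ben", "Cy", "Di"],
    "ages": [31, 19, 24, 19],
    "tenure": [2, 5, 1, 7],
})

# Oldest first.
oldest_first = df.sort_values("ages", ascending=False)
print(list(oldest_first["ages"]))          # [31, 24, 19, 19]

# Youngest first; among equal ages, longest tenure first.
by_age_then_tenure = df.sort_values(["ages", "tenure"], ascending=[True, False])
print(list(by_age_then_tenure["names"]))   # ['Di', 'Ben', 'Cy', 'Ana']
```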

Filtration Let’s look at how to filter DataFrames for information that meets a specific condition

Conditionals

Rather than filtering out information, we can create a boolean column that represents the condition we're checking. Let's take our current DataFrame and write a condition that shows those who are 21 or older and can drink:

    # using a conditional to create a true/false column to work with
    can_drink = df["ages"] >= 21
    print(can_drink)

Go ahead and run the cell. When you want to create a boolean column, you write out the condition against the entire column. Here, we created a can_drink variable that stores one true-false value for every value in the ages column, according to the condition that we wrote. We could potentially use this to create another column to work with.

Subsetting

When you need to filter out records but retain the information within the DataFrame, you need to use a concept called subsetting. We'll check for people who are 21 or older again, except this time we'll use the condition to filter out records rather than create a true-false representation:

    # using subsetting to filter out records and keep DataFrame intact
    df[df["ages"] >= 21]

Go ahead and run the cell. The output contains only those records whose ages are equal to or above 21. We took the condition and wrapped it within brackets while accessing the df variable. Although it may look weird, the syntax representation is the following:

    >>> dataframe_variable[ conditional statement to filter records with ]

If you stored the condition in a variable such as can_drink, you could also write the following for the same result:

    >>> df[can_drink]

Remember that can_drink is a representation of true-false values, which means that the preceding statement will filter out all records that have the value of false.
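Conditions can also be combined. The sketch below, on hypothetical data, uses & for “and” and ~ for “not”; each condition needs its own parentheses because of how Python orders these operators. The tenure column here is made up for the example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "names": ["Ana", "Ben", "Cy", "Di"],
    "ages": [31, 19, 24, 22],
    "tenure": [2, 5, 6, 7],
})

# Records that are 21 or older AND have more than 3 years of tenure.
adults_long = df[(df["ages"] >= 21) & (df["tenure"] > 3)]
print(list(adults_long["names"]))   # ['Cy', 'Di']

# The ~ operator flips every True/False value: records under 21.
under_21 = df[~(df["ages"] >= 21)]
print(list(under_21["names"]))      # ['Ben']

# The original DataFrame is left untouched by subsetting.
print(len(df))                      # 4
```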

Column Transformations Rarely, if ever, will the columns in the original raw DataFrame imported from CSV or a database be the ones you need for your analysis. You will spend lots of time constantly transforming columns or groups of columns using general computational operations to produce new ones that are functions of the old ones. Pandas has full support for this and does it efficiently

Generating a New Column with Data

To create a new column within a DataFrame, you use the same syntax as if you were adding a new key-value pair into a dictionary. Let's create a column of fake data that represents how long the people within our DataFrame have been customers of our company:

    # generating a new column of fake data for each record in the DataFrame to represent customer tenure
    random.seed(321)

    tenure = [random.randint(0, 10) for x in range(len(df))]

    df["tenure"] = tenure  # same as adding a new key-value pair in a dictionary
    df.head()

Go ahead and run the cell. The output will show a new column created with random numbers for each person's tenure. The column and its values are added by the df["tenure"] = tenure assignment. In Table 10-2, you'll find the updated DataFrame, sorted by age.

Table 10-2. Adding a new column to the DataFrame

Columns: ages, names, tenure

Records, sorted by age (names column): Rebecca, Tyler, Sandy, Jess, Ted, Barney, Jordan

apply()

Adding new columns based on current data is known as “feature engineering.” It makes up a good portion of a data analyst's job. Often, you won't be able to answer the questions you have from the data you collect. Instead, you need to create your own data that is useful for answering questions. For this example, let's try to answer the following question: “What age group does each customer belong to?” You could look at each person's age and assume their age group; however, we want to make it easier than that. In order to answer this question easily, we'll need to feature engineer a new column that represents each customer's age group. We can do this by using the apply method on the ages column. The apply method takes in each value, applies the function passed, and sets the value returned as the new column data. Let's check it out:

    # feature engineering a new column from known data using a UDF
    def ageGroup(age):
        return "Teenager" if age < 21 else "Adult"

    df["age_group"] = df["ages"].apply(ageGroup)
    df.head(10)


Go ahead and run the cell. Using the apply method, we're able to create a new column that easily answers our question. When adding the new age_group column, we applied the ageGroup function to the values in the ages column. Pandas called the function once for each value and set the return value of either “Teenager” or “Adult” as the value for the new age_group column. The apply method makes it easy for us to add new data with our own UDF. Take a look at Table 10-3.

Table 10-3. Feature engineering an age_group column

Columns: ages, names, tenure, age_group

    names      age_group
    Rebecca    Teenager
    Tyler      Teenager
    Sandy      Adult
    Jess       Adult
    Ted        Adult
    Barney     Adult
    Jordan     Adult

Note: When the new value depends on multiple columns, call apply on the DataFrame itself and set axis=1, so that the function receives one whole record at a time.
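As a minimal sketch of the note above, the following function reads two columns from each record. The data, the segment column, and the loyalty rule (21 or older with at least five years of tenure) are all hypothetical choices made for this example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "names": ["Ana", "Ben", "Cy", "Di"],
    "ages": [31, 19, 24, 22],
    "tenure": [2, 5, 6, 7],
})

def loyalty(row):
    # With axis=1, row is a Series holding one record,
    # so columns are accessed by name.
    if row["ages"] >= 21 and row["tenure"] >= 5:
        return "Loyal adult"
    return "Other"

# axis=1 passes each record (row) to the function instead of each column.
df["segment"] = df.apply(loyalty, axis=1)
print(df)
```

Without axis=1, apply would hand the function one whole column at a time, and the lookup by column name would fail.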

Aggregations

The raw data plus transformations is generally only half the story. Your objective is to extract actual insights and actionable conclusions from the data, and that means reducing it from potentially billions of rows to a summary of statistics via aggregation functions. This section assumes some knowledge of SQL and the GROUP BY statement. If you're not familiar with how GROUP BY works in SQL, visit w3schools³ for reference material.

³ www.w3schools.com/sql/sql_groupby.asp


groupby()

In order to condense the information down to a summary of statistics, we'll need to use the groupby method that Pandas has. Whenever you group information together, you need to use an aggregate function to let the program know how to summarize each group. For now, let's count how many records of each age group there are within our DataFrame:

    # grouping the records together to count how many records in each group
    df.groupby("age_group", as_index=False).count().head()

Go ahead and run the cell. When the information is grouped together using the count method, the program will simply add up the number of records that belong in each category. We'll have two categories: Adult with five records, and Teenager with two records. The first argument of our groupby method is the column we want to group on, and the second, as_index=False, keeps age_group as a regular column rather than making it the index. If it were set to True, the resulting DataFrame would use age_group as the unique identifier for each record.

mean()

Instead of counting how many records there are in each category, let's go ahead and find the averages of each column by using the mean method. We'll group based on the same column:

    # grouping the data to see averages of all numerical columns
    df.groupby("age_group", as_index=False).mean(numeric_only=True).head()

Go ahead and run the cell. Using the mean method, we'll be able to get the averages for all numerical columns. Passing numeric_only=True tells Pandas to skip text columns such as names; recent versions of Pandas raise an error without it. The output should result in a DataFrame that looks like Table 10-4.


Table 10-4. Grouping by age_group and averaging data

    age_group    ages    tenure
    Adult        28.8    5.4
    Teenager     19.0    5.0

Just by averaging the information, we can see that adults tend to have a longer tenure. Notice that the names column was dropped. This is because the mean can only be calculated for numerical data; there is no way to average out a string.
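Counting and averaging can also be done in one pass. The sketch below uses the agg method with named results, on hypothetical data; the output column names customers, avg_age, and avg_tenure are arbitrary labels chosen for this example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "names": ["Ana", "Ben", "Cy", "Di"],
    "ages": [30, 19, 24, 21],
    "tenure": [2, 5, 6, 7],
    "age_group": ["Adult", "Teenager", "Adult", "Adult"],
})

# Each keyword is a new column name: (source column, aggregate function).
summary = df.groupby("age_group", as_index=False).agg(
    customers=("names", "count"),
    avg_age=("ages", "mean"),
    avg_tenure=("tenure", "mean"),
)
print(summary)
```

Because each result names its own source column, text columns like names never get averaged by accident.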

groupby() with Multiple Columns

When you need to group by multiple columns, the arguments must be passed in as a list. The first item in the list will be the main column that the DataFrame is grouped by. In our case, let's check how many adults have a tenure of five years:

    # grouping information by their age group, then by their tenure
    df.groupby(["age_group", "tenure"], as_index=False).count().head(10)

Go ahead and run the cell. To answer the question, we needed to group by age_group first, in order to condense the information into adults and teenagers. Next, we needed to group the data further based on the tenure. This allows us to see how many adults there are for each length of tenure. As we don't have much data, the answer is only two. We arrive at this conclusion because we used the count method while grouping. All other tenures for each age group have only one customer.

Adding a Record

To add a record into the DataFrame, you'll need to access the next index and assign the values as a list. In our case, the next index would be 7. The values in the list must follow the DataFrame's column order, so check df.columns first if you're unsure of it. Let's add a row that is identical to one that already exists in our DataFrame, so we can see how to remove duplicate information in the next cell:

    # adding a record to the bottom of the DataFrame
    df.loc[7] = [25, "Jess", 2, "Adult"]  # add a record
    df.head(10)


Go ahead and run the cell. This will add a new record at the bottom with the same data as our record in index 0. You won’t need to add new records too often, but it helps to know how to do it when the time comes

drop_duplicates()

All too often, you'll see data with duplicate information, or just duplicate IDs. It's imperative that you remove all duplicate records, as they will skew your data and lead to incorrect answers. You can remove duplicate records based on a single column or on an entire record being identical. In our case, let's remove duplicates based on matching names, which will remove the record we just added into our DataFrame:

    # removing duplicates based on same names
    df = df.drop_duplicates(subset="names")
    df.head(10)

Go ahead and run the cell. This will remove the second record with the name “Jess.” By passing the column name into the subset parameter, we can remove all duplicates with the same name.

Note Omitting the subset argument will remove only duplicate records that have identical values in all columns
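Before removing anything, it can help to see which records Pandas considers duplicates. The sketch below uses the duplicated method for that, and the keep parameter of drop_duplicates to choose which copy survives; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data: "Ana" appears twice.
df = pd.DataFrame({"names": ["Ana", "Ben", "Ana"], "ages": [30, 19, 30]})

# True marks a record whose name was already seen higher up.
mask = df.duplicated(subset="names")
print(list(mask))                # [False, False, True]

# By default the first occurrence is kept.
kept_first = df.drop_duplicates(subset="names")
print(list(kept_first.index))    # [0, 1]

# keep="last" keeps the final occurrence instead.
kept_last = df.drop_duplicates(subset="names", keep="last")
print(list(kept_last.index))     # [1, 2]
```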

Pandas Joins

Often, you will have to combine data from several different sources to obtain the actual dataset you need for your exploration or modeling. Pandas draws heavily on SQL in its design for joins. This section assumes some knowledge of SQL and SQL joins. If you're not familiar with how joins work in SQL, visit w3schools⁴ for reference material.

⁴ www.w3schools.com/sql/sql_join.asp


Creating a Second DataFrame

Let's create a secondary DataFrame to represent our customers posting ratings about our company. We'll create ratings for only three users so we can see the difference between inner joins and outer joins:

    # creating another fake DataFrame to work with, having same names and a new ratings column
    ratings = {
        "names": ["Jess", "Tyler", "Ted"],
        "ratings": [10, 9, 6]
    }

    ratings = pd.DataFrame.from_dict(ratings)
    ratings.head()

Go ahead and run the cell. Now that we've created a second DataFrame, we can join the two DataFrames together, much like joining two tables together in SQL.

Inner Join

Anytime you perform a join, you need a column that both datasets share to join the data with. In our case, we can use the names column to join the ratings DataFrame with our original DataFrame. Let's perform an inner join on these two datasets so that we can connect users with their ratings:

    # performing an inner join with our df and ratings DataFrames based on names, get data that matches
    matched_ratings = df.merge(ratings, on="names", how="inner")
    matched_ratings.head()

Go ahead and run the cell. We'll get an output that looks like Table 10-5.


Table 10-5. Joining DataFrames to view customer ratings and ages together

Columns: ages, names, tenure, age_group, ratings

    names    age_group
    Tyler    Teenager
    Jess     Adult
    Ted      Adult

Using the merge method, we were able to perform a join. By setting the how parameter to “inner,” we were able to return a DataFrame with only the records of those who posted a rating. We can do a lot more with this data now than before. We could calculate the average age of customers who gave us a rating, the average rating per age group, etc. Joins connect separate DataFrames together, which helps especially when working with databases.

Outer Join

If we want to return all the records, but connect the ratings for people who gave one, we need to perform an outer join. This allows us to keep all records from our original DataFrame while adding the ratings column. We need to set the how parameter to “outer”:

    # performing an outer join with our df and ratings DataFrames based on names, get all data
    all_ratings = df.merge(ratings, on="names", how="outer")
    all_ratings.head()

Go ahead and run the cell. We'll get a DataFrame of all seven records this time; however, those that didn't give a rating were given a value of NaN. This stands for “Not a Number.” Once we combine this information, we could then find out the average age of those who gave a rating and those who didn't. From a marketing perspective, this would be helpful for knowing who the target demographic should be.
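The comparison described above, average age of those who rated versus those who didn't, could be sketched as follows. The data is hypothetical, and a left join is used here instead of an outer join because every rating belongs to an existing customer; the indicator=True option adds a _merge column showing where each record was found.

```python
import pandas as pd

# Hypothetical data for illustration.
customers = pd.DataFrame({"names": ["Ana", "Ben", "Cy"], "ages": [30, 19, 24]})
ratings = pd.DataFrame({"names": ["Ana", "Cy"], "ratings": [10, 6]})

# Keep every customer; ratings are NaN where no rating was given.
merged = customers.merge(ratings, on="names", how="left", indicator=True)
print(merged)

# notna() is True wherever a rating exists.
merged["has_rating"] = merged["ratings"].notna()

# Average age of customers with and without a rating.
avg_age = merged.groupby("has_rating")["ages"].mean()
print(avg_age)
```

In this made-up sample, customers who left a rating average 27 years old, while the one who did not is 19.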


Dataset Pipeline

A dataset pipeline is a specific process in which we take our data and clean it for our model, which will be able to make predictions. This can be a lengthy process if the dataset that you use is unclean. A dataset that is not clean will have duplicate records, null values everywhere, or unfiltered information that leads to incorrect predictions. Here is the general process:

1. Performing Exploratory Analysis

   In this step you want to get to know your data very well. Take notes on what you see at a glance or what you may want to clean or add. You essentially want to get a feel for what your data has to offer. Make note of the number of columns, the data types, outliers, null values, and columns that aren't necessary. This is generally when you want to plot out each column of data and speculate about correlations, non-informational features, etc.

2. Data Cleaning

   Improper cleaning can lead to poor predictions and bad datasets. Here, you'll want to remove unwanted observations like duplicates, fix structural errors like columns that should share a name but differ because of typos, handle missing data, and filter outlier information. This is key for the next step.

3. Feature Engineering

   Creating new information that isn't depicted by the dataset is important. You can use your own expertise if you have knowledge of the subject, and you can isolate data, which allows your algorithms to focus more on the important observations. Here you can feature engineer columns into a group, add dummy variables, remove unused features, etc. This is where you want to expand on the dataset with your own knowledge if you believe data is either missing or could be created from the information within the dataset.

Now that you know the process for cleaning a dataset, it will come in handy for the first exercise at the end of the day.
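The three steps can be sketched on a tiny DataFrame. This is only an illustration under stated assumptions: the raw data, the column names, and the age cut-off used for the groups are all made up, and a real dataset would need far more inspection than this.

```python
import pandas as pd

# Hypothetical raw data with a duplicate row and a missing age
raw = pd.DataFrame({
    "names": ["Tyler", "Jess", "Ted", "Ted", "Ann"],
    "ages": [16, 32, 45, 45, None],
})

# 1. exploratory analysis: size, data types, and missing values
print(raw.shape)
print(raw.dtypes)
print(raw.isnull().sum())

# 2. data cleaning: drop duplicate records, then rows with missing ages
clean = raw.drop_duplicates().dropna(subset=["ages"]).reset_index(drop=True)

# 3. feature engineering: derive an age_group column from ages
clean["age_group"] = pd.cut(
    clean["ages"],
    bins=[0, 19, 120],            # assumed cut-off between the two groups
    labels=["Teenager", "Adult"],
)
print(clean)
```

Dropping rows is only one way to handle missing data; filling them with fillna() may suit a dataset better, depending on what the column means.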


TUESDAY EXERCISES

1. Loading a Dataset. Go to www.kaggle.com and click “Datasets” in the top bar menu. Choose a dataset that you like, and download it into the “python_bootcamp” folder. Then, load the dataset into a Pandas DataFrame using the read_csv method, and display the top five records.

2. Dataset Analysis. This is an open-ended exercise. Run some analysis on the dataset you chose in exercise #1. Try to answer questions like these:
   a. How many records are there?
   b. What are the data types of each column?
   c. Are there duplicate records or columns?
   d. Is there missing data?
   e. Is there a correlation between two or more columns?

Today’s focus was on learning the all-important Pandas library and how to work with DataFrames. We used some minor real-life examples, but for the most part, today was just about understanding what you could do in Pandas. For Friday’s project, we’ll use Pandas to help us analyze sporting statistics

Wednesday: Data Visualization

Data visualization is one of the most powerful tools an analyst has, for two main reasons. First, it is unrivalled in its ability to guide the analyst’s hand in determining “what to look at next.” Often, a visual reveals patterns in the data that are not easily discernible by just looking at DataFrames. Second, visuals are an analyst’s greatest communication tool. Professional analysts need to present their results to groups of people responsible for acting on what the data says. Visuals can tell your story much better than raw numbers.


As with yesterday’s lesson, we need to install a library into our virtual environment. To follow along with today’s lesson, cd into the “python_bootcamp” folder and activate the environment. We’ll begin today within the terminal.

Types of Charts

Knowing which chart to use is important in presenting your data properly. We’ll go over several charts today; however, these are some of the common charts you’ll want to know:

• Line Chart: Explores data over time

• Bar Chart: Compares categories of data and can track changes over time

• Pie Chart: Explores parts of a whole, that is, fractions

• Scatter Plot: Like line charts, tracks the relationship between two variables

• Histogram: Not to be confused with bar charts; shows the distribution of a variable

• Candlestick Chart: Used a lot in the financial sector, for example, to compare a stock over a period

• Box Chart: Looks similar to candlestick charts, and compares the minimum, first quartile, median, third quartile, and maximum values

Depending on what you need to accomplish in conceptualizing your data, you will be able to choose a specific type of chart to portray your data

Installing Matplotlib

To install matplotlib, make sure your virtual environment is activated first, then write the following command into the terminal:

$ pip install matplotlib

After running the command, it should install a few packages that matplotlib requires. If you’d like to check and make sure you downloaded the proper library, just write out the list command.


Importing Matplotlib

To follow along with the rest of this lesson, let’s open and continue from our previous notebook file “Week_10” and simply add a markdown cell at the bottom that says, “Matplotlib.” Like Pandas, matplotlib has an industry standard name when you import the library.

# importing the matplotlib library
from matplotlib import pyplot as plt  # industry standard name of plt when importing

Go ahead and run the cell. We import pyplot as plt so that we can reference the many charts that matplotlib has to offer.

Line Plot

Let’s start with the most basic chart we can create, the line plot.

1. # creating a line plot using x and y coords
3. x, y = [1600, 1700, 1800, 1900, 2000], [0.2, 0.5, 1.1, 2.2, 7.7]
5. plt.plot(x, y)  # creates the line
7. plt.title("World Population Over Time")
8. plt.xlabel("Year")
9. plt.ylabel("Population (billions)")
11. plt.show()

Go ahead and run the cell. To start, we create our x and y coordinates for plotting. The plot() method allows us to plot a single line; it just needs the coordinates passed in. Lines 7, 8, and 9 customize the chart and its appearance. Lastly, we use the show() method to render the chart. You should output a chart like Figure 10-1.


Figure 10-1. Single line plot of population data

When you want to add more lines to the chart, you simply call the plot() method as many times as necessary. Let’s add some more customization to each plot line this time.

1. # creating a line plot with multiple lines
3. x1, y1 = [1600, 1700, 1800, 1900, 2000], [0.2, 0.5, 1.1, 2.2, 7.7]
4. x2, y2 = [1600, 1700, 1800, 1900, 2000], [1, 1, 2, 3, 4]
6. plt.plot(x1, y1, "rx-", label="Actual")  # create a red solid line with x dots
7. plt.plot(x2, y2, "bo--", label="Fake")  # create a blue dashed line with circle dots
9. plt.title("World Population Over Time")
10. plt.xlabel("Year")
11. plt.ylabel("Population (billions)")
12. plt.legend()  # shows labels in best corner
14. plt.show()

Go ahead and run the cell. By adding a second set of coordinates, we’re able to plot a second line using the plot() method on line 7. We also specified how the lines should render using shorthand syntax. For the third argument in the plot method, we can pass a string that represents the color, the symbol for the dots, and the line style. Finally, we added a label to each line to make the multiline chart easy to read, and we show the labels by calling the legend() method. The output should look like Figure 10-2.


Figure 10-2. Multiline plot of population data
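The shorthand string passed as the third argument packs three settings into one value. As a rough sketch of what it stands for, the block below draws one line with the shorthand and one with the equivalent keyword arguments, then compares the resulting styles. It uses the subplots() interface rather than the plt calls in the lesson, and the second line's y values are shifted by one only so the two lines don't overlap; both are choices made for this illustration.

```python
import matplotlib.pyplot as plt

x = [1600, 1700, 1800, 1900, 2000]
y = [0.2, 0.5, 1.1, 2.2, 7.7]

fig, ax = plt.subplots()

# shorthand: color "r", marker "x", line style "-"
(short_line,) = ax.plot(x, y, "rx-", label="shorthand")

# the same styling written out with keyword arguments
shifted = [value + 1 for value in y]
(long_line,) = ax.plot(x, shifted, color="r", marker="x", linestyle="-", label="keywords")

ax.legend()

# both lines report the same color, marker, and line style
print(short_line.get_color(), short_line.get_marker(), short_line.get_linestyle())
print(long_line.get_color(), long_line.get_marker(), long_line.get_linestyle())
plt.close(fig)
```

In a notebook, calling plt.show() instead of plt.close(fig) would render the chart. The keyword form is longer but tends to be easier to read when a chart has many lines.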

Bar Plot

When you need to plot categorical data, a bar plot is a much better choice. Let’s create some fake data for the number of people who chose each movie category as their favorite and plot it.

1. # creating a bar plot using x and y coords
3. num_people, categories = [4, 8, 3, 6, 2], ["Comedy", "Action", "Thriller", "Romance", "Horror"]
5. plt.bar(categories, num_people)
7. plt.title("Favorite Movie Category", fontsize=24)
8. plt.xlabel("Category", fontsize=16)
9. plt.ylabel("# of People", fontsize=16)
10. plt.xticks(fontname="Fantasy")
11. plt.yticks(fontname="Fantasy")
13. plt.show()

Go ahead and run the cell. After creating our data to work with, we create our plot on line 5 using the bar() method. The first argument sets the positions along the x axis and the second sets the height of each bar, so the numerical data goes on the y axis and our categories go on the x axis.


We’ve also added several new customizations to the chart. We can adjust the font size and the font to be displayed, including the font used for the tick labels on each axis. You should render a chart like Figure 10-3.

Figure 10-3. Bar plot of movie categories data

Box Plot

Box plots are useful in situations where you need to compare a single statistic either over time or against categories. They are like candlestick charts in their design: you can view the minimum, maximum, 25th percentile (first quartile), 75th percentile (third quartile), and median, which can be useful for displaying data over time. In the case of stocks, currency would be the y axis data and time would be the x axis data. For our example, let’s create two separate groups and display the heights for each.

1. # creating a box plot - showing height data for male-female
3. males, females = [72, 68, 65, 77, 73, 71, 69], [60, 65, 68, 61, 63, 64]
4. heights = [males, females]
6. plt.figure(figsize=(15, 8))  # makes chart bigger
7. plt.boxplot(heights)  # takes in list of data, each box is its own array, heights contains two lists
9. plt.xticks([1, 2], ["Male", "Female"])  # sets number of ticks and labels on x-axis
10. plt.title("Height by Gender", fontsize=22)
11. plt.ylabel("Height (inches)", fontsize=14)
12. plt.xlabel("Gender", fontsize=14)
14. plt.show()

Go ahead and run the cell. In order to plot the data in separate categories, we need to have a list of lists. On line 4, we declare our data, which holds a list of heights for both males and females. When we go to plot our data, it will separate each list into its own box. You’ll notice the figure is much larger than usual; we declare a new figure size on line 6. To render the chart, though, we use the boxplot() method on line 7 and pass heights in as our data. One of the more important lines is number 9, however, where we define the number of categories to appear on the x axis. We order them as “Male” then “Female” because that is the order in which they’re declared on line 4. The chart should render like Figure 10-4.

Figure 10-4. Box plot of height data
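The five values a box plot summarizes can also be computed directly. The sketch below uses Python's built-in statistics module on the same height lists; five_number_summary is a helper name invented for this illustration. Note that matplotlib draws its whiskers by its own rule (by default they stop at 1.5 times the interquartile range), so on data with outliers the whisker ends may not match the raw minimum and maximum.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # method="inclusive" interpolates between the sorted data points
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

males = [72, 68, 65, 77, 73, 71, 69]
females = [60, 65, 68, 61, 63, 64]

for label, group in (("Male", males), ("Female", females)):
    print(label, five_number_summary(group))
```

Reading the numbers next to the chart is a quick way to confirm which line inside each box is the median and where the box edges sit.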


Scatter Plot

If you’re familiar with clusters, then you’ll know the importance of scatter plots. These types of plots help to distinguish groups from each other by plotting a dot for each record of data. Using two characteristics, like the height and width of a flower, we can classify which species a flower belongs to. Let’s create some fake data and plot the points.

# creating a scatter plot to represent height-weight distribution
from random import randint, seed
seed(2)  # fixes the random numbers so the chart is reproducible
height = [randint(58, 78) for x in range(20)]  # 20 records between 4'10" and 6'6"
weight = [randint(90, 250) for x in range(20)]  # 20 records between 90lbs. and 250lbs.
plt.scatter(weight, height)
plt.title("Height-Weight Distribution")
plt.xlabel("Weight (lbs)")
plt.ylabel("Height (inches)")
plt.show()

Go ahead and run the cell. To create some fake data, we use the randint method from the random module. Here, we’re able to create 20 records for both the height and weight lists. To plot the data, we use the scatter() method and add some characteristics to the plot. You should get an output like Figure 10-5.


Figure 10-5. Scatter plot of height-weight data

Histogram

While line plots are great for visualizing trends in time series data, histograms are the king of visualizing distributions. Often, the distribution of a variable is what you’re interested in, and a visualization provides a lot more information than a group of summary statistics. First, let’s see how we can create a histogram.

1. # creating a histogram to show age data for a fake population
2. import numpy as np  # import the numpy module to generate data
3. np.random.seed(5)
5. ages = [np.random.normal(loc=40, scale=10) for x in range(1000)]  # ages distributed around 40
7. plt.hist(ages, bins=45)  # bins is the number of bars
9. plt.title("Ages per Population")
10. plt.xlabel("Age")
11. plt.ylabel("# of People")
13. plt.show()

Go ahead and run the cell. We’ve mentioned the NumPy module previously. It’s used in data science to perform extremely fast numerical calculations, and Pandas’ DataFrames are built on top of NumPy arrays. For the purpose of this cell, however, you just need to know that we’re using it to create random numbers that are centered around a given number. The number we specify is passed into the loc argument on line 5. The scale argument is the standard deviation, which controls how far the random numbers spread out from that center. Of course, individual numbers can still land well outside of that range, but we are primarily creating 1000 random numbers centered around the age of 40. To create the histogram, we use the hist() method and pass in the data. Histograms allow us to see how many values fall within each range of the data. In our example, the bar around the age of 40 counts more than 60 values. The y axis represents the frequency of the x axis values. The bins argument specifies how many bars you see on the chart. You may be thinking, the more bins the better, right? Wrong; there’s a fine line between too many and too few, and often you’ll just have to test out different numbers. We complete this chart by adding customization. The result should look like Figure 10-6.

Figure 10-6. Histogram of centrally distributed age data

Although the data is fake, we can deduce a lot of information from the chart. We can see outliers that may exist, where the general age range sits, and much more.
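To see what the bins argument actually does, it can help to look at the counts behind the bars. The sketch below uses NumPy's histogram function with the same loc, scale, and bins values as the cell above; drawing all 1000 numbers with the size argument is a shortcut taken here, not the approach used in the lesson.

```python
import numpy as np

np.random.seed(5)
ages = np.random.normal(loc=40, scale=10, size=1000)

# counts[i] is how many ages fall between edges[i] and edges[i + 1]
counts, edges = np.histogram(ages, bins=45)

busiest = counts.argmax()
print(len(counts), len(edges))             # one more edge than there are bins
print(counts.sum())                        # every age lands in exactly one bin
print(edges[busiest], edges[busiest + 1])  # range covered by the tallest bar
```

Changing bins to a much smaller or larger number and printing the counts again shows why the choice matters: too few bins hide the shape, while too many leave most bins nearly empty.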

Importance of Histogram Distribution

To see why histograms are so important for understanding central distribution, we’ll need to create some more fake data. We’ll then plot both datasets and see how they stack up.

# showing the importance of a histogram's display of central distribution
florida = [np.random.normal(loc=60, scale=15) for x in range(1000)]  # assume numpy is imported
california = [np.random.normal(loc=35, scale=5) for x in range(1000)]

# chart 1
plt.hist(florida, bins=45, color="r", alpha=0.4)  # alpha is opacity, making it see through
plt.show()

# chart 2
plt.hist(california, bins=45, color="b", alpha=0.4)  # alpha is opacity, making it see through
plt.show()

# chart 3
plt.hist(florida, bins=45, color="r", alpha=0.4)  # alpha is opacity, making it see through
plt.hist(california, bins=45, color="b", alpha=0.4)  # alpha is opacity, making it see through
plt.show()

Go ahead and run the cell. We’re able to output three different histograms within this cell because the show method is called three times. When you look at the first two histograms, they look nearly identical. It’s tough to see, without reading the axes closely, that the data is completely different. Therefore, to view the data properly, we output the third histogram with both datasets overlapping, as seen in Figure 10-7. We’re now able to clearly see the difference in the central distribution of each dataset. This is important when it comes to analyzing data. We set alpha to 0.4 to control the opacity: the higher the number, the more solid the bars become.


Figure 10-7. Histogram distribution plotting importance
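The difference that the overlapping chart makes visible can also be checked with two summary numbers per dataset. This is a small sketch using the same loc and scale values as the lesson; the seed is an assumption added only so the numbers repeat between runs.

```python
import numpy as np

np.random.seed(5)  # assumed seed, only so the sketch is repeatable
florida = np.random.normal(loc=60, scale=15, size=1000)
california = np.random.normal(loc=35, scale=5, size=1000)

for name, data in (("florida", florida), ("california", california)):
    # the mean is the center of the distribution, the std is its spread
    print(name, round(data.mean(), 1), round(data.std(), 1))
```

The two histograms look alike when drawn separately because each chart scales its axes to its own data, while the means and standard deviations stay far apart.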

Note When rendering several charts, matplotlib separates each plot by resetting the chart to empty after the show method is run. Until then, everything being plotted is included in one chart.

Saving the Chart

Being able to render these charts is wonderful; however, at times you need to use them within a presentation. Luckily for us, matplotlib comes with a method that can save the charts we create to a file. The savefig() method supports many different file extensions; the common “.jpg” is what we’ll use. Let’s render a simple line chart and save it to the local folder.

# using savefig method to save the chart as a jpg to the local folder
x, y = [1600, 1700, 1800, 1900, 2000], [0.2, 0.5, 1.1, 2.2, 7.7]
plt.plot(x, y, "bo-")  # creates a blue solid line with circle dots
plt.title("World Population Over Time")
plt.xlabel("Year")
plt.ylabel("Population (billions)")
plt.savefig("population.jpg")

Go ahead and run the cell. You’ll notice a new image file in the “python_bootcamp” folder called “population.jpg” now. If you don’t specify a file path, the image is saved in the local folder where the Jupyter Notebook file is located.

Note You can save the chart in other formats like PDF or PNG
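As a brief sketch of saving in another format, the block below writes a PNG into a temporary folder. The folder and file name are hypothetical choices for this illustration, and the dpi and bbox_inches arguments are optional extras that control the resolution and trim the surrounding whitespace.

```python
import os
import tempfile

import matplotlib.pyplot as plt

x = [1600, 1700, 1800, 1900, 2000]
y = [0.2, 0.5, 1.1, 2.2, 7.7]

fig, ax = plt.subplots()
ax.plot(x, y, "bo-")
ax.set_title("World Population Over Time")

# hypothetical output location; any writable folder works
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "population.png")

# the file extension selects the format; dpi sets the resolution
fig.savefig(png_path, dpi=150, bbox_inches="tight")
plt.close(fig)

print(os.path.exists(png_path))
```

Passing a full path like this is how you would save a chart somewhere other than the notebook's folder.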

Flattening Multidimensional Data

Generally, in data analysis you want to avoid 3D plotting wherever possible. It’s not because the information you want to convey isn’t contained within the result, but sometimes it is simply easier to express a point by other means. One of the best ways to represent a third dimension is to use color instead of depth. For instance, imagine that you have three datasets that you need to plot: height, weight, and age. You could render a 3D model, but that would be excessive. Instead, you can render the height and weight on a scatter plot like we have before and color each dot to represent the age. The third dimension of color is now easily readable, rather than trying to depict the data using the z axis (depth). Let’s create this exact scatter plot together.

1. # creating a scatter plot to represent height-weight distribution
2. from random import randint, seed
3. seed(2)
5. height = [randint(58, 78) for x in range(20)]
6. weight = [randint(90, 250) for x in range(20)]
7. age = [randint(18, 65) for x in range(20)]  # 20 records between 18 and 65 years old
9. plt.scatter(weight, height, c=age)  # sets the age list to be shown by color
11. plt.title("Height-Weight Distribution")
12. plt.xlabel("Weight (lbs)")
13. plt.ylabel("Height (inches)")
14. plt.colorbar(label="Age")  # adds color bar to right side
16. plt.show()

Go ahead and run the cell. By adding the c argument, which represents color, to the scatter plot, we can easily represent three datasets in a 2D manner, as seen in Figure 10-8. The color bar on the right side is created on line 14, where we also create the label for it. In some cases, you do need to use the z axis, such as when representing spatial data. However, when possible, using color as the third dimension is easier not only to create but to read as well.

Figure 10-8. Rendering a 3D plot using color as the third dimension

WEDNESDAY EXERCISES

1. Three Line Plot. Create three random lists of data that each have 20 numbers between 1 and 10. Then create a line plot with three lines, one for each list. Give each line its own color, dot symbol, and line style.

2. User Information. Create a program that asks any number of users to give a rating between 1 and 5 stars and plots a bar chart of the data when no more users would like to answer. Use the following text as an example of what to ask:

>>> What would you rate this movie (1-5)? 4
>>> Is there another user that would like to review (y/n)? y
>>> What would you rate this movie (1-5)? 5
>>> Is there another user that would like to review (y/n)? n
*** bar plot renders with two categories and two ratings ***


Today we learned the importance of data visualization and how to create custom charts to show off our data properly. There’s a wide range of plots to choose from when using matplotlib, and each has its own pros and cons, which you need to consider when choosing the type of plot. In the end, if you can’t properly show the data to those making the business decisions, then all the data you’ve collected is wasted.

Thursday: Web Scraping

You may have heard the term “web scraping” previously. In most languages, Python included, web scraping consists of two parts: sending out a request and parsing the data. We’ll need to use the requests module for the first part and a library called Beautiful Soup for the second part. In a nutshell, the script you write to request data and parse it is called a “scraper.” For today’s lesson, we’ll be collecting some data using these two libraries. Like yesterday’s lesson, we need to install a library into our virtual environment. To follow along with today’s lesson, cd into the “python_bootcamp” folder, and activate the environment. We’ll begin today within the terminal.

Installing Beautiful Soup

To install Beautiful Soup, make sure your virtual environment is activated first, then write the following command into the terminal:

$ pip install bs4

After running the command, it should install a few packages that Beautiful Soup requires.

Importing Beautiful Soup

To follow along with the rest of this lesson, let’s open and continue from our previous notebook file “Week_10” and simply add a markdown cell at the bottom that says, “Web Scraping.” We need to import requests and the BeautifulSoup class that is within the bs4 library.

# importing the beautiful soup and requests library
from bs4 import BeautifulSoup
import requests

Go ahead and run the cell. We’ll use the requests module to send out a request to a given URL. When the URL endpoint is not an API that gives back properly formatted data but rather a web page that renders HTML and CSS, the response that we get back is the code for that web page. In order to parse through this code, we pass it into the BeautifulSoup object, which makes it easy to manipulate and traverse through the code.

Requesting Page Content

To begin scraping data, let’s send a request to a simple web page that contains only a poem.

# performing a request and outputting the status code
page = requests.get("http://www.arthurleej.com/e-love.html")
print(page)

Go ahead and run the cell. We’ll get an output of “<Response [200]>”. This lets us know that the request to the web page was a success. In order to see what we received back as a response, though, we need to access the content attribute of the page variable.

# outputting the request response content
print(page.content)
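Printing the response is a manual check. A scraper that runs unattended would usually test the status in code instead. The following is a minimal sketch of that idea; fetch_html is a helper name invented here, and the ten-second timeout is an arbitrary assumption.

```python
import requests

def fetch_html(url, timeout=10):
    """Request a page and return its HTML text, raising an error on failure."""
    page = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    page.raise_for_status()
    return page.text

# example call, left commented out so nothing is requested when the cell runs:
# html = fetch_html("http://www.arthurleej.com/e-love.html")
```

The text attribute returns the response decoded into a string, while the content attribute used in the lesson returns the raw bytes; Beautiful Soup accepts either.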


Go ahead and run the cell. This will output a large string of all the code that was used to write this web page, including tags, styles, scripts, etc. As stated earlier, this URL renders a web page, so the response we get back is a string of all the code. The next step is to turn the response into an object that we can work with and parse through the data.

Parsing the Response with Beautiful Soup

The Beautiful Soup library comes with many attributes and methods that make parsing the code easier. Using this library, we can make the code easy to view, scrape, and traverse through. We’ll need to create a BeautifulSoup object to work with by passing the page content into it, along with the type of parser we want to use. In our case, we’re working with HTML code, so we’ll need to use the HTML parser.

# turning the response into a BeautifulSoup object to extract data
soup = BeautifulSoup(page.content, "html.parser")
print(soup.prettify())

Go ahead and run the cell. The prettify() method will create a well-formatted output for us to view. This makes it easier for us to see the actual code that is written. The soup object knew how to parse the content properly because of the parser that we specified. Beautiful Soup works with other markup languages as well, but we’ll be working with HTML for this book. Now that we’ve turned the content into an object we can use, let’s learn how to extract the data from the code.

Scraping Data There are many methods to extract data using Beautiful Soup. The following sections will cover a few of the main methods in doing so. Basic HTML knowledge is assumed for this section


.find()

To find a specific element within the code, we can use the find() method. The argument we pass is the tag that we want to search for, but it will only find the first instance and return it. This means that if there are four bold element tags within our code, and we use this method to find a bold tag, it will return only the first bold element tag found. Let’s try it out.

# using the find method to scrape the text within the first bold tag
title = soup.find("b")
print(title)
print(title.get_text())  # extracts all text within element

Go ahead and run the cell. If you look at the code using the inspector tab in your web browser’s developer tools, you’ll be able to see that the first bold tag within the code is the title of the poem. The first print statement outputs the whole element, tags included, around the title “Love,” and the second outputs only the text within the element. We were able to extract the text by using the get_text() method.

.find_all()

To find all instances of a given element, we use the find_all() method. This will give us back a list of all tags found within the code. Let’s find all bold tags within the code and extract the text.

# get all text within the bold element tag then output each
poem_text = soup.find_all("b")
for text in poem_text:
    print(text.get_text())

Go ahead and run the cell. If you were to look at the code using your inspector tools, you would notice that all the text is within bold tags. The result is an output of the entire poem.
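The difference between the two methods is easy to see on a small piece of HTML that doesn't require a request. The markup below is made up for this sketch and only loosely imitates the poem page.

```python
from bs4 import BeautifulSoup

# hypothetical page content, standing in for a real response
html = (
    "<html><body><b>Love</b><p>by a poet</p>"
    "<b>First line</b><b>Second line</b></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

first = soup.find("b")       # a single Tag, or None when nothing matches
every = soup.find_all("b")   # a list-like result of all matching Tags

print(first.get_text())
print([tag.get_text() for tag in every])
print(soup.find("i"))        # there is no <i> tag in this document
```

Because find() returns None when nothing matches, calling get_text() on its result without checking first can raise an AttributeError on pages that lack the tag.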


Finding Elements by Attributes

HTML elements can have attributes associated with them, whether it’s a style, id, class, etc., and you can use Beautiful Soup to find elements with a specific attribute value. Let’s request a response from my personal GitHub page and find the element that shows my username.

1. # finding an element by specific attribute key-values
3. page = requests.get("https://github.com/Connor-SM")
4. soup = BeautifulSoup(page.content, "html.parser")
6. username = soup.find("span", attrs={"class": "vcard-username"})  # find first span with this class
8. print(username)  # will show that element has class of vcard-username among others
9. print(username.get_text())

Go ahead and run the cell. We send a request to GitHub and parse the content into a BeautifulSoup object to work with. On line 6, we search for a span tag element that has an attribute of class, whose value is “vcard-username.” Line 8 will output the entire span tag, including text, attributes, and syntax. Lastly, we extract the text on line 9 to output the username associated with this page.

Note Finding elements by attributes also works with the find_all method. You can also include multiple key-value pairs to look for within the attrs argument
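Both points in the note can be tried without sending a request. The markup and the id value below are invented for this sketch; only the “vcard-username” class name comes from the lesson.

```python
from bs4 import BeautifulSoup

# hypothetical markup; the id and the extra class name are made up
html = """
<div>
  <span class="vcard-username" id="main-user">Connor-SM</span>
  <span class="vcard-username">second-user</span>
  <span class="vcard-fullname">Some Name</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all with attrs returns every span carrying the class
usernames = soup.find_all("span", attrs={"class": "vcard-username"})

# several key-value pairs must all match on the same element
main = soup.find("span", attrs={"class": "vcard-username", "id": "main-user"})

# class_ is a keyword shortcut for filtering on the class attribute
same = soup.find_all("span", class_="vcard-username")

print([tag.get_text() for tag in usernames])
print(main.get_text())
```

Class names on real sites change over time, so a scraper that depends on one may stop matching and should handle an empty result.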

DOM Traversal

This section will cover how to extract information by traversing through the DOM hierarchy. The DOM, short for Document Object Model, is a concept in web design that describes the relationships and structure between elements on a web page. All elements on a web page belong to one of three relationships:

1. Parent-Child
2. Sibling
3. Grandparent-Grandchild

This concept is important to understand when you are web scraping because you may need to access the children of a specific element. The children are all the elements nested directly within another element. Take the following HTML code, for instance.

<div>
    <h1>Title</h1>
    <h3>Sub-title</h3>
    <p>Text</p>
</div>

In this example, the div element is the parent of the h1, h3, and p elements. Those three elements are known as the children. If we wanted to extract all the text from within this element, we could access the children elements.

Note In the preceding example, the h1, h3, and p elements are all siblings. The body would be the parent of the div element and the grandparent of the h1, h3, and p elements. As the DOM is a web design concept, it’s covered only briefly in this book. If you would like more information on the subject or basic HTML knowledge, be sure to visit the w3schools resource at www.w3schools.com/js/js_htmldom.asp.

Accessing the Children Attribute

Lucky for us, when Beautiful Soup converts the page content into an object, it keeps track of the children for all elements. This allows us to traverse through the DOM and parse data as we see fit. Let’s grab the poem from earlier and convert the response into a BeautifulSoup object.


# traversing through the DOM using Beautiful Soup - using the children attribute
page = requests.get("http://www.arthurleej.com/e-love.html")
soup = BeautifulSoup(page.content, "html.parser")
print(soup.children)  # outputs an iterator object

Go ahead and run the cell. The children elements within the soup object are stored within an iterator. For the following exercise, let’s extract the title element from the web page.

Understanding the Types of Children

Before we begin, we first need to understand the types of children within the BeautifulSoup object. Let’s convert the iterator into a list of elements that we can loop over.

# understanding the children within the soup object
for child in list(soup.children):
    print(type(child))

Go ahead and run the cell. As a result, we’ll get four children but only three different types.

• <class 'bs4.element.Doctype'>: A Doctype object refers to the doctype declaration that defines the HTML version used.

• <class 'bs4.element.NavigableString'>: A string corresponds to a bit of text within a tag. Beautiful Soup uses the NavigableString class to contain these bits of text. So far, we’ve used the get_text() method to extract text; however, you can use the following to extract data as well:

>>> tag.string

This results in a NavigableString type.

• <class 'bs4.element.Tag'>: A Tag object corresponds to an XML or HTML tag in the original document. When we access the elements and their text, we’ll be accessing the original tags to do so.

If you were to output each of these objects, you’d find that all the code, aside from the Doctype, appears in the Tag object.

Accessing the Tag Object

If we want to access the text within the title tag, we need to traverse into its parent first, which happens to be the head tag. Now that we know the elements that we’re looking for reside in the Tag object, we need to save that object to a variable and output the sections within it.

# accessing the Tag object which holds the html - trying to access the title tag
html = list(soup.children)[2]
for section in html:
    print("\n\n Start of new section")
    print(section)

Go ahead and run the cell. When you output each section within our html variable, you’ll realize that there’s an empty section at the first index, before the location of the head element. We output the print statement for each new section, in case an empty string occupies an index.

Accessing the Head Element Tag

Now that we know the head element is at index 1 of the HTML children, we can follow the same steps to access each child within the head.

# accessing the head element using the children attribute
head = list(html.children)[1]
for item in head:
    print("\n\n New Tag")
    print(item)

Go ahead and run the cell. When you output each tag within the head, you’ll notice the title tag that we’ve been searching for resides at index 1.

Note Remember that each object stored in these variables can be iterated over and can be type converted into a list.

Scraping the Title Text

The final step is to extract the text from the title tag.

# scraping the title text
title = list(head)[1]
print(title.string)  # .string is used to extract text as well
print(type(title.string))  # results in NavigableString
print(title.get_text())

Go ahead and run the cell. We’ve just traversed through the DOM in order to scrape the text from our title element.
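The index numbers used in this walkthrough depend on where whitespace strings happen to sit between the tags. One way to make the traversal less fragile, sketched below on a made-up page, is to keep only the Tag children at each level. The tag_children helper and the sample HTML are assumptions for illustration and are not part of the lesson's code.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# hypothetical page, standing in for the poem's HTML
html_text = """<!DOCTYPE html>
<html>
<head>
<title>Example Poem</title>
</head>
<body><b>Love</b></body>
</html>"""
soup = BeautifulSoup(html_text, "html.parser")

def tag_children(node):
    """Return only the Tag children, skipping text strings and the doctype."""
    return [child for child in node.children if isinstance(child, Tag)]

html = tag_children(soup)[0]    # the <html> element
head = tag_children(html)[0]    # the <head> element
title = tag_children(head)[0]   # the <title> element
print(title.string)

# attribute access walks to the first matching descendant directly
print(soup.head.title.string)
```

The attribute form on the last line is shorter, but walking the children by hand is what makes it possible to handle every element at a given level in a loop.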

Note The ability to access an object’s child elements allows us to create modular or automated web scrapers that can perform a variety of tasks. As most sites follow a similar style across their web pages, a script that extracts information from a single page can usually do the same on many other pages, provided we know the proper pattern. For instance, the online statistical database for baseball called baseball-reference holds data for all baseball players throughout the history of the MLB. Each player has a unique identifier in the web site’s URL. If you wrote a parsing script that would extract information for one player, you would be able to write a loop to extract information for all players in the database.
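The looping idea in the note can be sketched as follows. This is only an illustration under stated assumptions: the URL pattern, the identifiers, and the `build_urls` and `scrape_all` helpers are hypothetical and do not reflect any real site’s layout. A dummy parser stands in for the request-and-parse step.

```python
# Hypothetical URL pattern and identifiers, for illustration only;
# a real site will have its own layout and its own terms of use.
BASE_URL = "https://example.com/players/{player_id}.html"
player_ids = ["player001", "player002", "player003"]

def build_urls(ids, pattern=BASE_URL):
    """Return one page URL per identifier."""
    return [pattern.format(player_id=pid) for pid in ids]

def scrape_all(ids, parse_page):
    """Apply the same single-page parser to every player page."""
    return {pid: parse_page(url) for pid, url in zip(ids, build_urls(ids))}

# A dummy parser stands in for the real request-and-parse step.
results = scrape_all(player_ids, parse_page=lambda url: len(url))
print(build_urls(player_ids)[0])
```

In a real scraper, `parse_page` would be the function you wrote for a single player page.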


THURSDAY EXERCISES

1. Word Count: Write a program that counts how many words are in the following link: www.york.ac.uk/teaching/cws/wws/webpage1.html. Use the requests module and Beautiful Soup library to extract all text.
2. Question #2: Using the following link, extract every stadium name out of the table: https://en.wikipedia.org/wiki/List_of_current_National_Football_League_stadiums. There should be 32 total names.

Today we learned how to collect information via a web scraper. Using the requests module, we can receive a response of code that renders a given web page. We can then turn this response into an object to easily parse and extract data via the Beautiful Soup library. In tomorrow’s lesson, we’ll use all the libraries that we learned throughout this week in order to analyze information that we scrape off the Web

Friday: Web Site Analysis

Today’s project will include the requests module, Beautiful Soup, and matplotlib libraries. The goal for this project is to create a script that will accept a web site to scrape and display the top words used within the site. We’ll plot the results within a nicely formatted bar plot, making it easier to understand for those looking at the data. To follow along with today’s lesson, cd into the “python_bootcamp” folder, and activate the environment. We’ll continue from our previous notebook file “Week_10” and add a markdown cell at the bottom that says, “Friday Project: Website Analysis.”

Final Design

As we do each week, we need to lay out a design of what the final program should look like, as well as how it should function. For testing purposes, we’ll use Microsoft’s home page. Eventually, we’ll want the final output to look like Figure 10-9.


Figure 10-9. Analyzing Microsoft’s most frequent words on their home page

We’re going to make the program continually ask the users if they’d like to scrape a web site, followed by accepting the users’ input for the site they’d like to analyze. After that, we can scrape the web site; filter out all information that isn’t useful, like article words, newline characters, etc.; and finally create the bar plot. The program output should look like the following:

    >>> Would you like to scrape a website (y/n)? y
    >>> Enter a website to analyze: https://www.microsoft.com/en-us/
    >>> The top word is: Microsoft
    >>> *** show bar plot ***
    >>> Would you like to scrape a website (y/n)? n
    >>> Thanks for analyzing. Come back again

In order to get the output working properly, we need to create an outline of the steps the program will require. Feel free to take a second to try and write them out yourself. The program that we create will need to perform the following steps:

1. Ask users if they’d like to web scrape a site.
   • If the user says yes:
     1. Accept input from users about the site they would like to scrape.
     2. Send a request to the web site.
     3. Parse all text from page content within the request response.
     4. Filter out all non-text elements, such as scripts, comments, etc.
     5. Filter out all article words and useless characters like newlines and tabs.
     6. Loop over all remaining text and count the frequency of each word.
     7. Keep the top seven words and display the most used word.
     8. Create a bar plot of the top seven words.
   • If the user says no:
     1. Exit the program and display a thank you message.
2. Continue to ask the users if they’d like to scrape a site until they say no.

Importing Libraries

We need to start off by importing all the libraries that we’ll be using throughout this project. Let’s put all the imports in their own cell so that we only need to run the import once rather than importing them each time we run the code for the program.

    1.  # import all necessary libraries
    2.  import requests
    3.  import matplotlib.pyplot as plt
    4.  from bs4 import BeautifulSoup
    5.  from bs4.element import Comment
    6.  from IPython.display import clear_output

Go ahead and run the cell. The only new import that you haven’t seen before is the import on line 5. To analyze only words that appear on the page, we’ll need to filter out all text within the comments somewhere in the program. Using the Comment class later will allow us to recognize if the text is within a comment or not, so that we can filter it out properly


Creating the Main Loop

Let’s write all the following code in the next cell, so that we don’t have to rerun all the imports. We’ll need to create a main loop, so that we can continue to ask the users if they’d like to scrape a web site. When they do, we’ll simply print the site they entered for now.

    1.  # main loop should ask if user wants to scrape, then what site to scrape
    2.  while input("Would you like to scrape a website (y/n)? ") == "y":
    3.      try:
    4.          clear_output()
    6.          site = input("Enter a website to analyze: ")
    8.          print(site)    # remove after runs properly
    9.      except:
    10.         print("Something went wrong, please try again.")
    12. print("Thanks for analyzing. Come back again.")

Go ahead and run the cell. This gives us our basic loop structure of asking the users for input about the site they would like to scrape. If they choose not to scrape, then we output a thank you message. We want to wrap the main portion of this loop in a try-except clause because we can’t expect the user to always input a valid URL. If the user doesn’t put in a valid URL, an error could occur. Now, we don’t have to worry about the error, and the program will continually ask the users if they’d like to scrape another web site.
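As a complement to the try-except clause, the input could also be checked before any request is sent. The following is a minimal sketch using only the standard library; the `looks_like_url` helper is a hypothetical name and is not part of the book’s program. It only checks the shape of the text and does not guarantee that the site exists or will respond.

```python
from urllib.parse import urlparse

def looks_like_url(site):
    """Return True if the text has an http(s) scheme and a host name."""
    parts = urlparse(site.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(looks_like_url("https://www.microsoft.com/en-us/"))
print(looks_like_url("www.microsoft.com/en-us/"))
print(looks_like_url("not a url"))
```

A check like this would let the program give a more specific message, such as reminding the user to include "https://", instead of the generic error text.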

Note Anytime you restart the notebook, you’ll need to run the import cell again

Scraping the Web Site

Now that we’ve accepted input from the user, we need to scrape the web site. It would be best to separate this code from the main loop, so let’s put it inside of its own function.

    1.  # request site and return top 7 most used words
    2.  def scrape(site):
    3.      page = requests.get(site)
    5.      soup = BeautifulSoup(page.content, "html.parser")
    7.      print( soup.prettify() )    # remove after runs properly
    9.  # main loop should... ◽◽◽
    14.         site = input("Enter a website... ◽◽◽
    16.         scrape(site)
    17.     except: ◽◽◽

Go ahead and run the cell. After we ask the user to input a web site, we call the scrape function on line 16 with the site variable as the argument. The program will then request the content from the site and parse it with BeautifulSoup. For testing purposes, we use the prettify() method to output the response that we get back. If you look through the output, you’ll notice there’s a lot of text inside of element tags that doesn’t show up on the web site. Tags like scripts and comments include text that we do not want to include in our analysis, so we’ll need to filter them out eventually. Once they are filtered out, we’ll be left with the actual text that appears on the home page of the web site. Remove the code on line 7 once the cell runs properly.

Note To follow along with the book, use the Microsoft URL: https://www.microsoft.com/en-us/

Scrape All Text

Now that we’re receiving a response, we can begin to parse all the text within the page content.

    1.  # request site and return top 7 most used words
    2.  def scrape(site):
    3.      page = requests.get(site)
    5.      soup = BeautifulSoup(page.content, "html.parser")
    7.      text = soup.find_all(text=True)    # will get all text within the document
    9.      print(text)    # remove after runs properly
    11. # main loop should... ◽◽◽


Go ahead and run the cell. We use the find_all method from our BeautifulSoup object in order to grab every piece of text contained within the page. Notice this gives us back a list that contains newline characters, tab characters, scripts, comments, and the actual text that we need within the proper text elements like h1, p, etc. The next step is to filter out those unnecessary elements. Remove line 9 once the cell runs properly; this is used for testing purposes only

Filtering Elements

Although we’re parsing the text from the page content, much of the text is within elements that we don’t want to include in our analysis. Let’s take the script tag, for instance. The script tag is used to write JavaScript within the web page. If we were to include this in our analysis, it would lead to improper results. The same goes for HTML comments, which look like the following:

    <!-- This is an HTML comment -->

Any text within an HTML comment is not seen on the web page. It’s the same concept as a Python comment. They’re used for programmers and not read by compilers. Knowing that we only want to perform an analysis on words that appear on the page, we must filter out these unnecessary elements.

    1.  # filter out all elements that do not contain text that appears on site
    2.  def filterTags(element):
    3.      if element.parent.name in [ "style", "script", "head", "title", "meta", "[document]" ]:
    4.          return False
    6.      if isinstance(element, Comment):
    7.          return False
    9.      return True
    17.     text = soup.find_all(text=True) ◽◽◽
    19.     visible_text = filter(filterTags, text)
    21.     for text in visible_text:
    22.         print(text)    # remove after runs properly
    24. # main loop should... ◽◽◽


Go ahead and run the cell. After we parse all the text from the page content on line 17, we filter out the unnecessary elements on line 19. The built-in filter function loops over every item within our text variable and applies the filterTags function to decide whether the item should be included. We basically want to return True if the item is not a comment and is not inside an element that shouldn’t be included. Line 3 is where we check to see if the text is within an element that we do not want to include. All the strings included in the list on line 3 are HTML elements. Comments are slightly different though because they are not elements. To know if an item is a comment, we need to use Beautiful Soup’s Comment object.

Note When Beautiful Soup parses the page content, it recognizes HTML as one of four objects: Tag, NavigableString, BeautifulSoup, and Comment.

On line 6, we check to see if the item is an instance of the Comment object. If it is, we return False so that we can filter it out. If the item is not a comment and its parent is a valid element, then we return True. We then loop over the variable to output each item on line 21. We’re now left with only the words that appear on the web page. You’ll notice that there is a lot of white space between each word; removing it is the next step. Remove line 22 after the cell runs properly, as it is only used for testing purposes.
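To see the same filtering idea in isolation, here is a sketch that swaps in a different technique: Python’s standard-library `html.parser` instead of Beautiful Soup. It is not the book’s implementation; the `VisibleText` class, the `HIDDEN_TAGS` set, and the sample HTML are assumptions made for illustration. It skips text nested inside hidden elements and ignores comments.

```python
from html.parser import HTMLParser

# Elements whose text should not count as visible page text (assumed list).
HIDDEN_TAGS = {"style", "script", "head", "title"}

class VisibleText(HTMLParser):
    """Collect text that is not nested inside a hidden element."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # greater than 0 while inside a hidden element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0 and data.strip():
            self.chunks.append(data.strip())

    # Comments are passed to handle_comment, which does nothing by default,
    # so comment text never reaches self.chunks.

sample = ("<html><head><title>Hidden</title><script>var x = 1;</script></head>"
          "<body><!-- a comment --><h1>Hello</h1><p>Visible text</p></body></html>")
parser = VisibleText()
parser.feed(sample)
print(parser.chunks)
```

The decision made here is the same one filterTags makes: keep a piece of text only if it is not a comment and not inside an element such as script or style.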

Filtering Waste

The next step is to filter out any escape characters (newlines, tabs); article words, such as a, an, the, etc.; and any other words we deem useless. When we perform the analysis on a site, we want to see the top descriptive words. Knowing that a site’s top word is “the” does not tell us anything about the site. For example, when scraping a news site, we would expect to see keywords about the top story of the day. To perform this filter, we’ll need to create another function that will handle removing what we call “waste.”

    1.  # filter article words and hidden characters
    2.  def filterWaste(word):
    3.      bad_words = [ "the", "a", "in", "of", "to", "you", "\xa0", "and", "at", "on", "for", "from", "is", "that", "his",
    4.                    "are", "be", "-", "as", "&", "they", "with", "how", "was", "her", "him", "i", "has", "." ]
    6.      if word.lower() in bad_words:
    7.          return False
    8.      else:
    9.          return True
    11. # filter out all elements that do not... ◽◽◽
    31.     for text in visible_text: ◽◽◽
    32.         words = text.replace("\n", "").replace("\t", "").split(" ")    # replace all hidden chars
    34.         words = list( filter(filterWaste, words) )
    36.         for word in words:
    37.             print(word, end=" ")    # remove after runs properly
    39. # main loop should... ◽◽◽

Go ahead and run the cell. We start this process on line 31 by looping over each item in visible_text and replacing all newline and tab characters with empty strings. Then we run a filter on that item with our filterWaste function to see if we need to remove it from the list on line 34. Within the filterWaste function, we define a list of words or characters that we want to filter out called bad_words. After converting the item to lowercase, we check to see if it exists within bad_words; if it does, we return False; otherwise, we return True to keep it within the list. On line 37 we output each word after we perform the filter. The words contained in this output are descriptive and informative enough to tell us what the web site is mainly talking about. Remove line 37 once the cell runs properly, as this is used for testing purposes only.

Note You can add more words or characters to the bad_words data collection if you’d like. This is to simply get us by for the time being. There is a library called NLTK which has a large list of article words and characters that you can use for larger projects when necessary
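One possible variation on the waste filter, offered as a sketch rather than the book’s approach: calling `split()` with no arguments splits on any run of whitespace (spaces, newlines, tabs, and the non-breaking space `\xa0`) and never produces empty strings, and a set makes the membership check cheap. The `clean_words` helper and the shortened `STOP_WORDS` collection are hypothetical names used only here.

```python
# Shortened sample of words to drop; extend it as needed.
STOP_WORDS = {"the", "a", "in", "of", "to", "and", "-", "&"}

def clean_words(text, stop_words=STOP_WORDS):
    """Split on any whitespace and drop stop words, ignoring case."""
    return [word for word in text.split() if word.lower() not in stop_words]

cleaned = clean_words("The\n Surface  Pro\t and the\xa0Xbox")
print(cleaned)
```

With this approach the later check for empty strings would no longer be needed, because `split()` without arguments does not return them.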


Count Word Frequency

After we’ve filtered out all the waste and elements, we’re left with the proper words to run our analysis on. The next step is to count the number of times a given word appears. A dictionary is a good fit for keeping track of the count for each word because we can use the word as the key and the frequency as the value.

    29.     visible_text = filter(filterTags, text) ◽◽◽
    31.     word_count = { }
    33.     for text in visible_text: ◽◽◽
    38.         for word in words: ◽◽◽
    39.             if word != "":    # if it doesn't equal an empty string
    40.                 if word in word_count:
    41.                     word_count[ word ] += 1
    42.                 else:
    43.                     word_count[ word ] = 1
    45.     print(word_count)    # remove after runs properly
    53. # main loop should... ◽◽◽

Go ahead and run the cell. On line 31 we create our dictionary to keep track of the word count. As we loop over each word in our words list, we first check to see if it’s an empty string because we converted all escape characters to empty strings and certainly don’t want to include them in the count. On line 40, we check to see if the word has already been added to the dictionary, in which case we would add one to the value (line 41). If it hasn’t been added to the dictionary yet, then we simply add a key of the word and a value of 1. We output the result on line 45 to see the word and its frequency. Now we can view all the words and the times they occurred; however, we want to plot the top words, so we’ll need to sort the dictionary next. Remove line 45 after the cell runs properly.
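The same counting logic can also be written with `collections.Counter` from the standard library. This is an alternative sketch, not the code used in the project, and the sample `words` list is made up for illustration.

```python
from collections import Counter

# Sample data standing in for the filtered words from one page.
words = ["Surface", "Xbox", "", "Surface", "Office", "", "Surface", "Xbox"]

# Skip empty strings, then let Counter tally each remaining word.
word_count = Counter(word for word in words if word != "")
print(word_count["Surface"], word_count["Xbox"], word_count["Office"])
```

A Counter behaves like a dictionary, so the word is still the key and the frequency is still the value.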


Sort Dictionary by Word Frequency

In order to output or return the top seven words, we’ll need to sort the dictionary by the value.

    43.                     word_count[ word ] = 1 ◽◽◽
    45.     word_count = sorted(word_count.items(), key=lambda kv: kv[1], reverse=True)    # sort on value
    47.     print( word_count[:7] )    # remove after runs properly
    49. # main loop should... ◽◽◽

Go ahead and run the cell. To understand what’s going on here, first we need to clarify what the output of .items() becomes.

    >>> d = { "word" : 1, "hello" : 2 }
    >>> result = d.items()
    >>> print(result)
    dict_items([('word', 1), ('hello', 2)])

The result is a view that holds a couple of tuples, much like a list of tuples. Normally, using the sorted function on a dictionary would result in a list sorted by the key; however, when we pass a lambda function as the key argument, sorted takes in each of these tuples and sorts based on index 1, which is the value and represents the frequency of the word. Remember that the sorted function returns a list. When we run line 45, it results in a list of tuples sorted from highest to lowest value because of the argument “reverse=True”. Lastly, we output the top seven words by slicing. Remove line 47 after the cell runs properly.
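The difference between sorting by key and sorting by value can be checked on a small dictionary. The sample data below is made up for illustration, and the slice keeps two items instead of seven to keep the output short.

```python
word_count = {"office": 4, "surface": 9, "xbox": 6, "windows": 7}

# items() yields (word, count) pairs; kv[1] is the count.
by_count = sorted(word_count.items(), key=lambda kv: kv[1], reverse=True)
top_two = by_count[:2]

# Sorting the dictionary itself only sorts its keys.
by_key = sorted(word_count)
print(top_two, by_key)
```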

Displaying the Top Word

Now that we’re getting the top seven words, let’s output the most used word for good measure.

    45.     word_count = sorted... ◽◽◽
    47.     return word_count[:7]
    49. # main loop should... ◽◽◽
    54.         site = input("Enter a website... ◽◽◽
    56.         top_words = scrape(site)
    58.         top_word = top_words[0]    # tuple of (word, count)
    60.         print("The top word is: {}".format( top_word[0] ))    # don't remove
    61.     except: ◽◽◽

Go ahead and run the cell. We start by returning the top seven words from the scrape function rather than printing them out. This will return the list of tuples back to our main loop on line 56 and save them into the top_words variable. After that, we assign the first tuple into our top_word variable because it represents the most frequent word used on the page. Lastly, we output the top word on line 60 by accessing the zero index of the tuple that contains the word and frequency count.

Graphing the Results

The last step in our program that we need to execute is graphing the results within a bar plot.

    1.  # graph results of top 7 words
    2.  def displayResults(words, site):
    3.      count = [ item[1] for item in words ][::-1]    # reverses order
    4.      word = [ item[0] for item in words ][::-1]    # gets word out of reversed order
    6.      plt.figure( figsize=(20, 10) )    # define how large the figure appears
    8.      plt.bar(word, count)
    10.     plt.title("Analyzing Top Words from: {}...".format( site[:50] ), fontname="Sans Serif", fontsize=24)
    11.     plt.xlabel("Words", fontsize=24)
    12.     plt.ylabel("# of Appearances", fontsize=24)
    13.     plt.xticks(fontname="Sans Serif", fontsize=20)
    14.     plt.yticks(fontname="Sans Serif", fontsize=20)
    16.     plt.show()
    77.         print("The top word is... ◽◽◽
    79.         displayResults(top_words, site)    # call to graph
    80.     except: ◽◽◽

Go ahead and run the cell. We’ll get the final output that we’ve been programming toward throughout this entire lesson. The graph will show the top seven words and their frequency in a nicely formatted bar plot by calling the displayResults function on line 79. We pass in the arguments of top_words and site in order to give the graph its data and title. On lines 3 and 4, we separate the values and words into their respective lists using list comprehensions. We then reverse them using the slice at the end; otherwise, it would show the graph from highest to lowest. The bar graph is plotted on line 8 by passing this data into the bar method. Lastly, we add a title, labels, and some styles.
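The separation and reversal done on lines 3 and 4 can be tried without drawing a plot. The sketch below uses `zip` as an alternative to the two list comprehensions; the `top_words` sample data is made up for illustration.

```python
top_words = [("surface", 9), ("windows", 7), ("xbox", 6)]  # sample data

# zip(*pairs) transposes the list of (word, count) tuples into two tuples.
words, counts = zip(*top_words)

# Reverse both sequences the same way so each word keeps its count.
words_low_to_high = list(words)[::-1]
counts_low_to_high = list(counts)[::-1]
print(words_low_to_high, counts_low_to_high)
```

Whichever form is used, both lists have to be reversed together; reversing only one would pair each word with the wrong count on the plot.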

Final Output

The program is now complete and can be used to analyze the top words for any web site. Note that some sites can and do block the request, in which case the except block will run. You can find all the code for this week, as well as this project, in the GitHub repository.

Today we learned how to create a program that would scrape any site input by the user. It was important to see how we could use several of these data analysis libraries together to create a useful tool. Now we can use this web tool to analyze news sites and see trending information. Of course, this is a simple web scraper but with proper modifications could become a more useful tool

Weekly Summary

There are many Python libraries that are useful for data analysis. Throughout this week, we covered some of the most widely used modules and libraries in the industry. This week has prepared you to begin learning more about analysis and how to implement these libraries further to improve your skills. Having covered virtual environments, you’ll know how to work with Python libraries and manage your packages. Using the requests module, we were able to call APIs and parse page content. This module allows our programs to communicate with other software in order to improve user experience. One of the most important libraries this week, however, is the Pandas data analysis library. It’s used by data analysts and scientists in almost every field. It gives you the power to use Python and SQL together; it’s extremely efficient and makes working with databases and files much easier. It’s truly the end all, be all library for analysis. Data analysis wouldn’t be complete, however, without visualization. Using matplotlib we were able to cover a variety of plots that we could use and how to effectively showcase our data. It’s important to remember that data without proper visualization will never produce quality results. The last library we covered was for web scraping with Beautiful Soup, an important library to help make sense of other languages within Python, where we were able to parse information and text from a page request. Lastly, we coupled three of these four lessons within a program to create a web scraping analysis tool. To further your learning on this topic, you can use www.elitedatascience.com or learn about the data science libraries, such as NLTK and scikit-learn.

Challenge Question Solution

As we learned in the lesson on Wednesday, it’s difficult and time-consuming to try and implement a 3D visualization of data. The question this week asked how we could simplify this while expressing a 3-dimensional graph. Having covered the answer toward the end of the lesson on Wednesday, we found that we can use color as a third dimension. This allows us to keep a graph within 2-dimensional space but have three dimensions. This is important to keep in mind when trying to simplify data for those who make decisions based on visualizations.

Weekly Challenges

To test out your skills, try these challenges:

1. User Input: As we saw in our Friday project, there were many article words or characters that we wanted to filter out. Unfortunately, we can’t keep track of all of them for each site. For this challenge, implement a block of code that asks the users what additional words or characters they would like to filter out so that they may alter the words shown.
2. Saving the Plot: Implement a block of code that asks the users if they would like to save the file. If they do, be sure to ask the users what they would like to call the image and save it with that name.
3. Pandas Implementation: Rather than using a dictionary to track the words from the web site scrape in our project, implement Pandas into the code to track the information. You should be able to perform a head or tail function to see the top or bottom most frequently used words.
4. Saving the Data: After implementing Pandas to save the unique words and their frequency, output the information to a CSV for each site. The name of the file should represent the web site name, for example, “microsoft_frequent_words.csv”.
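As a starting hint for the file naming in challenge 4, one way to derive a name from the URL is sketched below. The `csv_name_for` helper is a hypothetical name, and the rule of taking the first host label after "www" is an assumption that suits the Microsoft example; for a host such as en.wikipedia.org it would return "en", so you may want a different rule.

```python
from urllib.parse import urlparse

def csv_name_for(site):
    """Build a file name such as 'microsoft_frequent_words.csv' from a URL."""
    host = urlparse(site).netloc or site    # fall back if no scheme was given
    parts = [part for part in host.split(".") if part and part != "www"]
    name = parts[0] if parts else "site"
    return "{}_frequent_words.csv".format(name)

print(csv_name_for("https://www.microsoft.com/en-us/"))
```

Writing the data itself is left to the challenge.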


AFTERWORD

Post-Course: What to Do Now?

Often, when a student finishes a class or a reader finishes a book, they’re left wondering where to go next. It’s a broad question, especially when you’re new to this field. If you’ve been programming for a long time, this is probably easy for you, as you most likely read this book to pick up or switch to Python. For the rest of you, it’s much tougher, especially if this is the first book you’ve read on programming. My answer is generally the same to each person that asks… what interests you? Your answer will affect my advice for you. What follows is a list of resources, video channels, and other books to read based on the category that interests you. Each section has been separated by the types of jobs you can receive with knowledge of Python. When you embark on becoming a programmer, remember to give back to the community. As developers, we use resources like Quora or Stack Overflow to help get answers to our problems. Be the type of person that answers respectfully and helpfully. Remember that those who helped you out did so in their free time. Without the continued help throughout our community, we would not learn and continue to improve.

Back-End Development with Python

When you become a developer, there are many roles you can apply for. Back-end development is made for those of us who don’t want to worry about the design, interfaces, or anything front end related but rather focus on the algorithms, speed, and mechanics of the software itself. If you find passion in Python and back-end concepts like SQL, servers, requests, and APIs, then this is a great place for you to start.


Full-Stack Development with Python

Full-stack development encompasses front end, back end, server side, and web dev all in one. There’s a lot to learn when you want to become a full-stack engineer. If you’re eager to learn more about how to build full-scale web sites, software as a service, networking, and more… then this path would help.

Data Analysis with Python

We only began to scratch the surface of what you can do with Python in data analytical roles. If you found that you enjoyed Week 10 within the book, then this would certainly be a great next step for you.

Data Science with Python

We never touched upon this subject in depth; we merely pointed out certain concepts in Week 10. Data science encompasses many different fields of study: machine learning, artificial intelligence, computer systems, web scraping, forms of data analysis, and much more. If you think you’d be interested in learning more about these topics, this is the right step for you.

Resources

Table 1 shows general resources. These are resources just to get you started on the right path. There are many more valuable resources out there for each of these categories.


Table 1. Resources Name

Type
