Beginning R: The Statistical Programming Language (Wrox) PDF





Beginning R: The Statistical Programming Language, by Mark Gardener, is available from the publisher, Wrox. Chapter 2 is titled Starting Out: Becoming Familiar with R.

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered ...

Publisher Comments

Gain better insight into your data using the power of R

While R is very flexible and powerful, it is unlike most of the computer programs you have used. In order to unlock its full potential, this book delves into the language, making it accessible so you can tackle even the most complex of data analysis tasks. Simple data examples are integrated throughout so you can explore the capabilities and versatility of R. Along the way, you'll also learn how to carry out a range of commonly used statistical methods, including Analysis of Variance and Linear Regression. By the end, you'll be able to effectively and efficiently analyze your data and present the results.

Beginning R:

- Discusses how to implement some basic statistical methods such as the t-test, correlation, and tests of association
- Explains how to turn your graphs from merely adequate to simply stunning
- Provides you with the ability to define complex analytical situations
- Demonstrates ways to make and rearrange your data for easier analysis
- Covers how to carry out basic regression as well as complex model building and curvilinear regression
- Shows how to produce customized functions and simple scripts that can automate your workflow

Code Downloads: Take advantage of free code samples from this book, as well as code samples from hundreds of other books, all ready to use.

Read More: Find articles, ebooks, sample chapters and tables of contents for hundreds of books, and more reference resources on programming topics that matter to you.

Wrox Beginning guides are crafted to make learning programming languages and technologies easier than you think, providing a structured, tutorial format that guides you through all the techniques involved. Visit the Beginning R website at www.

This book examines this complex language using simple statistical examples, showing how R operates in a user-friendly context.

This one file contains all the example datasets and scripts you need for the whole book. Once you have the file on your computer you can load it into R by one of several methods. On Windows or Mac you can drag the Beginning.RData file icon onto the R program icon; this opens R if it is not already running and loads the data. If R is already open, the data are appended to anything you already have in R; otherwise only the data in the file are loaded. You can also use the load command. In that case the Beginning.RData file must be in your default working directory; if it is not, you must specify the location as part of the filename.

However, no one is perfect, and mistakes do occur.

If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata you may save another reader hours of frustration, and at the same time you will be helping us provide even higher quality information. On the book details page, click the Book Errata link. On this page you can view all errata that have been submitted for this book and posted by Wrox editors.

If you don't spot your error on the Book Errata page, go to www.

We'll check the information and, if appropriate, post a message to the book's errata page and fix the problem in subsequent editions of the book. The forums are a web-based system for you to post messages relating to Wrox books and related technologies and interact with other readers and technology users. The forums offer a subscription feature to e-mail you topics of interest of your choosing when new posts are made to the forums. Wrox authors, editors, other industry experts, and your fellow readers are present on these forums.


To join the forums, just follow these steps:

1. Go to p2p.
2. Read the terms of use and click Agree.
3. Complete the required information to join as well as any optional information you wish to provide and click Submit.
4. You will receive an e-mail with information describing how to verify your account and complete the joining process.

Once you join, you can post new messages and respond to messages other users post. You can read messages at any time on the web. If you would like to have new messages from a particular forum e-mailed to you, click the Subscribe to this Forum icon by the forum name in the forum listing.

For more information about how to use the Wrox P2P, be sure to read the P2P FAQs for answers to questions about how the forum software works as well as many common questions specific to P2P and Wrox books.

R is a sophisticated computer language and environment for statistical computing and graphics.

R is available from the R-Project for Statistical Computing website www. It has quickly gained a widespread audience. It is currently maintained by the R core-development team, a hard-working, international team of volunteer developers. The R project webpage is the main site for information on R. At this site are directions for obtaining the software, accompanying packages, and other sources of documentation.


Many routines have been written for R by people all over the world and made freely available from the R project website as packages. However, the basic installation (for Linux, Windows, or Mac) contains a powerful set of tools for most purposes. Because R is a computer language, it functions slightly differently from most of the programs that users are familiar with. You have to type in commands, which are evaluated by the program and then executed. This sounds a bit daunting to many users, but the R language is easy to pick up and a lot of help is available.
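As a minimal sketch of what typing in commands looks like, the lines below could be entered one at a time at the R prompt. The object names and numbers are made up for illustration and do not come from the book.

```r
# Each line is a command: R evaluates it and, where there is a result, shows it.
print(3 + 4 * 2)              # simple arithmetic; multiplication is done first

values <- c(2, 4, 6, 8)       # store four numbers under the name 'values'
total  <- sum(values)         # add them up
avg    <- mean(values)        # arithmetic mean of the four numbers

print(total)
print(avg)
```

Commands like these can also be copied from a notes file and pasted into R, which is a convenient way to keep a record of what you tried.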

It is possible to copy and paste in commands from other applications (for example, word processors, spreadsheets, or web browsers), and this facility is very useful, especially if you keep notes as you learn. Additionally, the Windows and Macintosh versions of R have a graphical user interface (GUI) that can help with some of the basic tasks.

R can deal with a huge variety of mathematical and statistical tasks, and many users find that the basic installation of the program does everything they need. However, many specialized routines have been written by other users and these libraries of additional tools are available from the R website.

If you need to undertake a particular type of analysis, there is a very good chance that someone before you also wanted to do that very thing and has written a package that you can download to allow you to do it.
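A hedged sketch of how an additional package might be fetched and made available from within R follows. The helper name ensure_package is hypothetical (it is not part of R), and the sketch is demonstrated with the stats package, which ships with R, so nothing is actually downloaded here.

```r
# ensure_package() is a hypothetical helper written for this sketch.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)     # fetches from a CRAN mirror; needs a connection
  }
  requireNamespace(pkg, quietly = TRUE)   # TRUE once the package can be loaded
}

ok <- ensure_package("stats") # 'stats' is part of the basic installation
print(ok)

library(stats)                # attach the package so its commands can be used
```

For a contributed package you would pass its name instead of "stats"; the first call would then download and install it before it can be attached with library().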

R is open source, which means that it is continually being reviewed and improved. R runs on most computers; installations are available for Windows, Macintosh, and Linux.

It also has good interoperability, so if you work on one computer and switch to another you can take your work with you. R handles complex statistical approaches as easily as more simple ones. Therefore, once you know the basics of the R language, you can tackle complex analyses as easily as simple ones (as usual, it is the interpretation of results that can be the really hard bit).

Throughout the text, the use of these commands is illustrated, which is indeed the point of the book. Where a command is illustrated in its basic form, you will see a fixed width font to mimic the R display, like so: help(). Keep these conventions in mind as you are reading this chapter; they will come into play as soon as you have R installed and are ready to begin using it.

The R Website

The R website at www. It is also a good place to look for help items and general documentation as well as additional libraries of routines. If you use Windows or a Mac, you will need to visit the site to download the R program and install it. You can also find installation files for many Linux versions on the R website. The R website is split into several parts; links to each section are on the main page of the site.

The two most useful for beginners are the Documentation and Download sections. In the Documentation section (see Figure) a Manuals link takes you to many documents contributed to the site by various users. You can access these and a variety of help guides under Manuals > Contributed Documentation. These are especially useful for helping the new user to get started.

Additionally, a large FAQ section takes you to a list that can help you find answers to many questions you might have. There is also a Wiki, and although this is still a work in progress, it is a good place to look for information on installing R on Linux systems. In the Downloads section you will find the links from which you can download R. The following section goes into more detail on how to do this.

The benefit of having this network of websites is improved download speeds. For all intents and purposes, CRAN is the R website and holds downloads, including old versions of software and documentation. When you perform searches for R-related topics on the internet, adding CRAN or R to your search terms improves your results.

To get started downloading R, you'll want to perform the following steps. First, visit the main R web page www. The starting page of the CRAN website appears once you have selected your preferred mirror site.

This page has a Software section on the left with several links. Choose the R Binaries link to install R on your computer (see Figure). You can also click the link to Packages, which contains libraries of additional routines. However, you can install these from within R, so you can just ignore the Packages link for now. The Other link goes to a page that lists software available on CRAN other than the R base distribution and regular contributed extension packages. This link is also unnecessary for right now and can be ignored as well.

Once you click the R Binaries link you move to a simple directory containing folders for a variety of operating systems (see Figure). Select the operating system on which you will be running R and follow the link to a page containing more information and the installation files that you require.

Versions of Windows after XP require some additional steps to make R work properly. For Vista or later you need to alter the properties of the R program so that it runs with Administrator privileges. To do so, follow these steps:

1. Click the Windows button (this used to be labeled Start).
2. Select Programs.
3. Choose the R folder.
4. Right-click the R program icon to see an options menu (see Figure).

You will then see a new options window.

Run R by clicking the Programs menu, shortcut, or quick-launch icon like any other program.

This is important, as you see later. R will save your data items and a history of the commands you used to the disk, and it cannot do this without the appropriate access level.

Once the file has downloaded it may or may not open as a disk image, depending on how your system is set up. Once the DMG file opens you can double-click the installer file and installation will proceed (see Figure). Installation is fairly simple and no special options are required. Once installed, you can run R from Applications and place it in the dock like any other program.

Downloadable install files are available for many Linux systems on the R website (see Figure). The website also contains instructions for installation on several versions of Linux.

Many Linux systems also support a direct installation via the Terminal. These repositories are not always very up-to-date, however, so if you want to install the very latest version of R, look on the CRAN website for instructions and an appropriate install file. The exact command to install directly from the Terminal varies slightly from system to system, but you will not go far wrong if you open the Terminal and type R into it.

If R is not installed (the most likely scenario), the Terminal may well give you the command you need to get it (see Figure). In other systems you may need two elements to install, like so: sudo apt-get install r-base r-base-dev. The basic R program and its components are built from the r-base part. For many purposes this is enough, but to gain access to additional libraries of routines the r-base-dev part is needed. Once you run these commands you will connect to the Internet and the appropriate files will be downloaded and installed.

Once R is installed it can be run through the Terminal program, which is found in the Accessories part of the Applications menu. On a Macintosh the program is located in the Applications folder and you can drag this to the dock to create a launcher or create an alias in the usual manner.

On Linux the program is launched via the Terminal program, which is located in the Accessories section of the Applications menu.

Once the R program starts up you are presented with the main input window and a short introductory message that appears a little different on each OS. In Windows, a few menus are available at the top (see Figure). In this case you also have some menus available, and they are broadly similar to those in the Windows version. You also see a few icons; these enable you to perform a few tasks but are not especially useful.

Under these icons is a search box, which is useful as an alternative to typing in help commands (you look at getting help shortly). Getting to know where help is available is a good starting point, and that is the subject of the next section.

A lot of material is available for help with R, and tracking down the useful information can take a while. Of course, this book is a good starting point! In the following sections you see the most efficient ways to access some of the help that is available, including how to access additional libraries that you can use to deal with the tasks you have. You'll also find some useful beginner's guides in the Contributed Documentation section. Different authors take different approaches, and you may find one suits you better than another.

Try a few and see how you get on. Additionally, preferences will change as your command of the system develops. There is also a Wiki on the R website that is a good reference forum, which is continually updated.

NOTE: Remember that if you are searching for a few ideas on the internet, you can add the word CRAN to your search terms in your favorite search engine (adding R is also useful). This will generally come up with plenty of options.

The Help Command in R

R contains a lot of built-in help, and how this is displayed varies according to which OS you are using and the options (if any) that you set.

The basic command to bring up help is help(topic). Simply replace topic with the name of the item you want help on. You can also save a bit of typing by prefacing the topic with a question mark, like so: ?topic. This works for all the different operating systems. Of course, you need to know what command you are looking for to begin with. If you are not quite sure, you can use the command apropos("partword"). This lists the commands whose names contain the text you typed; replace partword with the text you want to search for. To search the text of the help files themselves, use help.search("partword") instead.

Note that unlike the previous help command you do need the quotes (single or double quotes are fine as long as they match).

Help for Windows Users

The Windows default help generally works fine (see Figure), but the Index and Search tabs only work within the section you are in, and it is not possible to get to the top level in the search hierarchy. If you return to the main command window and type in another help command, a new window opens, so it is not possible to scroll back through entries unless they are in the same section.

The help window acts like a browser and you can use the arrow buttons to return to previous topics if you follow hyperlinks. You can also type search terms into the search box. Scrolling to the foot of the help entry enables you to jump to the index for that section (see Figure). Once at the index you can jump further up the hierarchy to reach other items.

The top level you can reach is identical to the HTML version of the help that you get if you type the help. If you return to the main command window and type another help item, the original window alters to display the new help. You can return to the previous entries using the arrow buttons at the top of the help window.

Help for Linux Users

Help in Linux is displayed by default as plain text and appears in the Terminal window, temporarily blotting out what was displayed previously (see Figure). When you are finished, hit the Q key to return to the Terminal window.

Although at this point you will not really know any R commands, it is a useful time to look at a specific command to illustrate the help feature. In this example you look at the mean command. As you may guess, this determines the arithmetic mean of a set of numbers.

Try the following:

1. First, type in the following command: help.
2. Click the Packages link and then click the base link.
3. Navigate your way down to the mean command and look at the entry there.
4. Navigate back to the first page and use the Search Engine link to search for the mean command.
5. You will see several entries, depending on which additional packages are installed. Select the base::mean entry in this case, which brings up help for the command to determine the arithmetic mean.
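The help commands for the mean command can also be tried directly at the prompt. The following is a small sketch under the assumption that the standard help files are installed; the numbers passed to mean are made up for illustration.

```r
# List the names of available commands that contain the text "mean".
matches <- apropos("mean")
print(head(matches))

# Look up the help entry for mean; typing ?mean at the prompt does the same.
# Assigning the result avoids opening the help viewer in a script.
entry <- help("mean")

# The command itself: the arithmetic mean of a set of numbers.
result <- mean(c(2, 4, 6, 8))
print(result)
```

At an interactive prompt you would normally just type help(mean) or ?mean and read the entry that is displayed.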

Take a look at a specific example of a help window here using the mean command again. You start by bringing up the help item for this command.

You can type one of the following: help(mean) or ?mean. In any event you will get a help entry that looks like the one in the Figure.

You also learn how to access the built-in help system and find out about additional packages of useful analytical routines that you can add to R.

This chapter builds some familiarity with working with R, beginning with some simple math and culminating in importing and making data objects that you can work with and saving data to disk for later use.

This chapter deals with manipulating the data that you have created or imported. These are important tasks that underpin many of the later exercises. The skills you learn here will be put to use over and over again.

This chapter is all about summarizing data.

Here you learn about basic summary methods, including cumulative statistics. You also learn about cross-tabulation and how to create summary tables.

In this chapter you look at visualizing data using graphical methods (for example, histograms) as well as mathematical ones. This chapter also includes some notes about random numbers and different types of distribution (for example, normal and Poisson).
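The summary methods mentioned above (basic summaries, cumulative statistics, and cross-tabulation) might look something like the following sketch. The data values and object names are invented for illustration and are not the book's datasets.

```r
rainfall <- c(12, 0, 7, 3, 8)            # made-up measurements

print(summary(rainfall))                 # minimum, quartiles, mean, maximum
running_total <- cumsum(rainfall)        # cumulative sum: 12 12 19 22 30
running_max   <- cummax(rainfall)        # largest value seen so far

# Cross-tabulation: count how often each combination of two factors occurs.
survey <- data.frame(
  site   = c("A", "A", "B", "B", "B"),
  colour = c("red", "blue", "red", "red", "blue")
)
counts <- table(survey$site, survey$colour)
print(counts)
```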

In this chapter you learn how to carry out some basic statistical methods such as the t-test, correlation, and tests of association. Learning how to do these is helpful for when you have to carry out more complex analyses and also illustrates a range of techniques for using R.

In this chapter you learn how to produce a range of graphs including bar charts, scatter plots, and pie charts.
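As a rough sketch of the basic statistical methods named above, the following runs a t-test, a correlation test, and a chi-squared test of association using commands from the basic installation. All data values are made up for illustration.

```r
# t-test: do two groups differ in their means?
group_a <- c(5.1, 4.9, 6.2, 5.8, 6.0)
group_b <- c(6.9, 7.4, 6.8, 7.9, 7.1)
tt <- t.test(group_a, group_b)           # Welch two-sample t-test by default
print(tt$p.value)

# Correlation: how strongly are two variables linearly related?
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
ct <- cor.test(x, y)                     # Pearson correlation by default
print(ct$estimate)

# Association: are the row and column categories of a table related?
visits <- matrix(c(20, 5, 8, 17), nrow = 2,
                 dimnames = list(bee = c("honey", "bumble"),
                                 colour = c("blue", "yellow")))
assoc <- chisq.test(visits)
print(assoc$p.value)
```

Each command returns a result object whose parts (such as p.value) can be extracted by name, which is useful when results need to be reported or reused.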


As your analyses become more complex, you need a more complex way to tell R what you want to do. This chapter is concerned with an important element of R, the formula notation. The chapter has two main parts; the first part shows how the formula notation can be used with simple situations. The second part uses an important analytical method, analysis of variance, as an illustration.
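To give a flavor of the formula notation, here is a minimal sketch of a one-way analysis of variance. The data frame, its column names, and its values are hypothetical; the formula height ~ feed reads as "height explained by feed".

```r
# Made-up data: plant height under three feed levels.
growth <- data.frame(
  height = c(10, 12, 11, 15, 16, 14, 20, 21, 19),
  feed   = factor(rep(c("low", "mid", "high"), each = 3))
)

# response ~ predictor: the same notation is used by many R commands.
fit <- aov(height ~ feed, data = growth)
print(summary(fit))                      # the analysis of variance table

# Group means, to help interpret the result.
group_means <- tapply(growth$height, growth$feed, mean)
print(group_means)
```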

This is an important chapter because the ability to define complex analytical situations is something you will inevitably require at some point.

This chapter builds on the previous one. Now that you have seen how to define more complex analytical situations, you learn how to make and rearrange your data so that it can be analyzed more easily.

This also builds on knowledge gained in Chapter 3. In many cases, when you have carried out an analysis you will need to extract data for certain groups; this chapter also deals with that, giving you more tools that you will need to carry out complex analyses easily.

This chapter is all about regression. It builds on earlier chapters and covers various aspects of this important analytical method. You learn how to carry out basic regression as well as complex model building and curvilinear regression. It is also important because it illustrates some useful aspects of R (for example, how to dissect results).
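A small sketch of basic and curvilinear regression, including how a result can be dissected into its parts, is given below. The x and y values are invented for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6)                 # made-up predictor values
y <- c(2.0, 4.1, 5.9, 8.2, 9.9, 12.1)    # made-up response values

fit <- lm(y ~ x)                         # basic linear regression
print(coef(fit))                         # intercept and slope
print(summary(fit)$r.squared)            # one component of the summary

# Curvilinear regression: add a squared term with I() inside the formula.
curve_fit <- lm(y ~ x + I(x^2))
print(coef(curve_fit))
```

After plotting the data with plot(x, y), a call such as abline(fit) would add the straight line of best fit to the graph.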

The later parts of the chapter deal with graphical aspects of regression, such as how to add lines of best fit and confidence intervals.

This chapter builds on the earlier chapter on graphics (Chapter 7) and on the previous chapter on regression. It shows you how to produce more customized graphs from your data. For example, you learn how to add text to plots and axes, and how to make superscript and subscript text and mathematical symbols.

You learn how to add legends to plots and how to add error bars to bar charts or scatter plots. Finally, you learn how to export graphs to disk as high-quality graphics files, suitable for publication.

In this chapter you learn how to start producing customized functions and simple scripts that can automate your workflow and make complex and repetitive tasks a lot easier.
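The graph customizations described above (a legend, error bars, added text, and export to a file) could be sketched as follows. The means, error values, and output file name are all made up for illustration, and PDF is used here simply because that graphics device is part of the basic installation.

```r
# Made-up group means and standard errors.
means <- c(control = 4.2, treated = 6.1)
errs  <- c(control = 0.5, treated = 0.7)

out_file <- file.path(tempdir(), "example_barplot.pdf")  # hypothetical name

pdf(out_file, width = 6, height = 4)     # open a PDF graphics device
mids <- barplot(means, ylim = c(0, 8), col = c("grey80", "grey40"),
                ylab = "Response")       # barplot() returns the bar midpoints
arrows(mids, means - errs, mids, means + errs,
       angle = 90, code = 3, length = 0.05)   # flat-ended arrows as error bars
legend("topleft", legend = names(means), fill = c("grey80", "grey40"))
text(mids, means + errs + 0.4, labels = c("a", "b"))   # text above each bar
invisible(dev.off())                     # close the device to write the file
```

Nothing appears on screen while the PDF device is open; the graph is written to the file when dev.off() is called.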

The book includes many examples, and these are included in the Beginning.RData file. You can download that file by clicking on the link.

This one file contains all the example datasets and scripts you need for the whole book. Once you have the file on your computer you can load it into R by one of several methods. If you have Windows or Macintosh you can load the file using menu commands or use a command typed into R. The Beginning.RData file must be in your default working directory; if it is not, you must specify the location as part of the filename.
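The following is a minimal sketch of loading a data file whose location is given as part of the filename. So that it runs without the real download, it first writes a small stand-in file; the folder and the objects Qty and qty are made up for illustration. The getwd() command shows the working directory and ls() lists what is in memory afterwards.

```r
wd <- getwd()                      # the current (default) working directory
print(wd)

# Stand-in for Beginning.RData so the sketch runs without the real download.
data_file <- file.path(tempdir(), "Beginning.RData")   # hypothetical location
Qty <- c(5, 8, 2)                  # made-up objects, not from the book
qty <- "a different object"
save(Qty, qty, file = data_file)   # write both objects to the file
rm(Qty, qty)                       # clear them from memory

# The file is not in the working directory, so the location is part of the name.
loaded <- load(data_file)          # load() returns the names it restored
print(loaded)
print(ls())                        # names are case sensitive: Qty is not qty
```

With the real file you would skip the stand-in step and pass the path to your own copy of Beginning.RData to load().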

Alternatively, you can find the working directory in R by using the getwd() command. Then drag the Beginning.RData file into that directory and use the load() command.

R uses named objects, so everything gets a name. You can see what is included in the Beginning.RData file by using the ls() command. This will show you everything currently in the memory of R. Remember that names are case sensitive, so Qty is not the same as qty.

There are four main kinds of object in the Beginning.RData file.

Many of the objects in the Beginning.RData file are data. For example, the bv object shows some results for visits of bees to various colors of flower. These data are used to carry out a goodness-of-fit test by comparing the observed visits to the theoretical ratio expected.

Some of the objects in the Beginning.RData file are results. For example the pw.

R is very flexible, and one useful aspect is the ability to create simple functions. For example, the pn object is a function that applies a polynomial formula to any numerical value. In this case the polynomial formula was taken from a previous analysis and is used to draw a line of best fit onto a graph.

If you require a more complex task or want to automate your workflow, you can create a longer "script". The cum.
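A simple function in the spirit of the polynomial example might be sketched as below. The function name poly_fn and its coefficients are hypothetical; the book's own pn object takes its values from a previous analysis that is not reproduced here.

```r
# Hypothetical polynomial: y = a + b*x + c*x^2, with made-up default coefficients.
poly_fn <- function(x, a = 1, b = 2, c = -0.1) {
  a + b * x + c * x^2
}

x_vals <- 0:10
y_vals <- poly_fn(x_vals)          # the function works on a whole set of values
print(y_vals)

# With a graph already drawn, lines(x_vals, y_vals) would add this curve to it.
```

Because the coefficients are arguments with defaults, the same function can be reused with different values, for example poly_fn(5, a = 0, b = 1, c = 0).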