Tuesday, June 30, 2015

What Not To Do

The summer is almost halfway over (oh my god where did the time go?!?!), which means the school year is about to start. And it's not just any school year; it's my senior year of college.  That means GREs, grad school applications, and graduation, but most importantly, it means writing my senior thesis.

Some of you may know that I'm writing a science fiction novel based on my research for my thesis.  Well, I'm quickly getting more and more worried about the fact that I've decided to undertake this giant project without ever having written any long creative pieces.


But, lucky for me, I happen to be surrounded by so many talented and generous people -- talented in that they know about writing and generous in that they're willing to lead my novice butt to something that might actually be good.  

One such person is Dr. (!) Sarah Rugheimer, who just got her Ph.D. in astrophysics from Harvard.  She sat down with me for over an hour last week and talked to me about my ideas.  It was amazing.  I went into that meeting with this vague, amorphous, kind-of-sort-of plan for a novel and walked out of it with a much clearer idea of what to write and how to go about writing it. 

I won't tell you the what now. You'll have to actually read my thesis to get that (see what I did there? that pre-publication advertising?).  But the how is something that everyone who wants to write should learn, if they haven't already.  Here it is, in a few simple points:
  1.  Read (That's pretty obvious.  You can't be a good writer without being a good reader first.)
  2. Identify stories that are similar to yours and study them
  3. Plan out plot points. 
    • Use note cards or something else you can move around so you can play with the order of things. 
  4. Write without interruptions for short periods of time (\(\approx 25\) mins) with regular breaks at the end of each session
I spent tonight working on the first two. Dr. Rugheimer recommended a few books, one of which is The Sparrow by Mary Doria Russell.  I'm not too far along, so I can't really summarize it, but I can already tell it's really good.  If you're interested in science fiction and space and ethical quandaries, this is the book for you.

She also recommended the movie The Host, based on the book of the same name by Stephenie Meyer (yes, the same woman who wrote Twilight).  The movie wasn't meant to be an example of great writing or plot development*.  Rather, the idea that I have for my novel is similar to the movie's plot, and now that I've watched it, I know they won't be identical.  I now even have ideas on how I can make my book better.  Maybe not in the eyes of preteen girls, because she'll probably always have me beat there, but that's not really my target audience anyway.  In other words, because of this movie, I have the beginning of a list of things I'm not going to do when I write my thesis, and that's as good a place as any to start.



*Disclaimer: The movie (and, I'm assuming, the book) is a lot better than Twilight!  The emotional climax of the movie actually made me cry, but that doesn't mean much because I cry at pretty much any movie.

Monday, June 29, 2015

P Cygni Profiles

P Cygni profiles are great diagnostic tools for anyone studying anything related to star formation. But, as with many astronomical tools and concepts, they aren't the easiest thing to research on the web with a simple Google search.  The knowledge I have now, the knowledge I'm about to share with you, was gathered over a couple of weeks from internet searches, textbooks, and conversations with professional astronomers.

What are P Cygni profiles?
P Cygni profiles are a spectral pattern named for P Cygni, a bright variable star in the constellation Cygnus. It's one of the most luminous stars in our galaxy (\(L = 610000 L_{sun}\)).  That's cool and all, but P Cygni is more than just a really bright star.  It has a massive outflow, which means matter is flowing away from it.  This outflow is the cause of the star's characteristic profile: a blueshifted absorption line and a redshifted emission line (I'll explain what those terms mean in the next section).

Example of a P Cygni profile from Wikipedia's P Cygni page

 
Why do they exist?
The P Cygni profile is the result of the Doppler Effect.

Have you ever stood by as an ambulance passed you with its siren on? If you have, hopefully you'll remember what I'm about to describe. If you haven't, hopefully you have a good imagination. As the ambulance approaches you, the sound the siren makes gets higher-pitched. As it moves away from you, the sound gets lower. Why? Picture sound waves coming off of the siren and moving toward you through the air. When the ambulance is approaching you, the sound waves are getting pushed together, and because shorter wavelengths correspond to higher-pitched noises, the siren sounds higher. The opposite is true for the waves coming off of the back of the ambulance as it moves away from you: they get stretched out, so the siren sounds lower.

That's the effect with sound, but it's more or less the same with light waves.



All of the information we get in astronomy comes from photons. In this case, photons are coming from the outflow and hitting our telescope. The gas along our line-of-sight (directly between us and the star), shown with the blue arrow, is moving toward us, so its light waves get bunched together much like the sound waves in front of the ambulance, and the wavelengths of those photons become shorter. Any photons coming from gas that's moving away from us (shown with the red arrows) have slightly longer wavelengths than they would usually have.

Unlike sound, this doesn't result in a higher or lower pitch, or even in a lighter or darker color. This results in blue- and redshifted data, respectively. That just means that the features shift left (blueshifted) or right (redshifted) along the wavelength axis.

Now you might be thinking, "Okay, that explains why there are two bumps, but why does one of them go below the line and the other goes above it?"

Great question!  The one that goes below the line is called an absorption line.  This means that photons at that wavelength were absorbed by electrons on their way to us, so fewer of them reach our telescope. In astronomy, we see absorption lines when there's cold gas between us and a hot source, or in this case, gas between us and a star.

The one that goes above the line is an emission line. This means that extra photons at that wavelength were emitted by electrons in the gas.

How can we use them? 
So now you know what P Cygni profiles are and why they happen. That's only half the battle.  Now we have to understand what makes them so useful.

Imagine that, instead of one star, there were, let's say, billions of them. Hell, let's say we have a galaxy, or two. Let's go one step farther and say we have two galaxies and they're colliding. (We're done saying things now.)

When we observe these merging galaxies, we see a P Cygni profile, and that lets us know that there are outflows.  But outflows (or streams of matter flying away from an object) can be caused by more than just one phenomenon.  Star formation and Active Galactic Nuclei (AGN, which are powered by supermassive black holes) both cause massive outflows. How do we figure out what's causing the outflows we see in our merging galaxies?

P Cygni profiles!  It turns out that AGN-driven outflows and star formation-driven outflows move at really different speeds.  We can find those speeds by measuring how far the absorption and emission lines are shifted from the wavelength they would have if the gas were at rest.
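To make the link between a shifted line and a speed concrete, here is a minimal sketch of the Doppler relation for speeds much slower than light, \(v \approx c\,(\lambda_{obs} - \lambda_{rest})/\lambda_{rest}\). The function name and the two observed wavelengths are made up for illustration; only the H-alpha rest wavelength is a real number, and none of this is the actual pipeline used in the research.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def doppler_velocity(observed_nm, rest_nm):
    """Line-of-sight speed in km/s from a wavelength shift.

    Uses the non-relativistic approximation v = c * (obs - rest) / rest.
    Negative means blueshifted (gas moving toward us),
    positive means redshifted (gas moving away from us).
    """
    return C_KM_S * (observed_nm - rest_nm) / rest_nm


H_ALPHA_REST_NM = 656.28       # rest wavelength of the H-alpha line

# Hypothetical measurements, invented for this example:
absorption_trough_nm = 655.84  # the dip, shifted to the blue
emission_peak_nm = 656.35      # the bump, shifted to the red

v_abs = doppler_velocity(absorption_trough_nm, H_ALPHA_REST_NM)
v_em = doppler_velocity(emission_peak_nm, H_ALPHA_REST_NM)
print(f"absorption: {v_abs:.0f} km/s, emission: {v_em:.0f} km/s")
```

The sign tells you the direction and the size tells you the speed, which is the number you would compare between the two kinds of outflow.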

This is just one example of how P Cygni profiles can be used (it just happens to be the way that I'm using them in my research).  But no matter how you use them, P Cygni profiles have this tendency to provide both a question and its answer, and I think that's pretty damn beautiful.

Wednesday, June 24, 2015

Sigmas, and Errors, and Mus, Oh My!

If you do or read about science, you're (probably) familiar with the concept of error bars, but do you really know what they mean?  I'm going to try to explain it to you in this blog post.

Let's say you're reading a scientific paper and you see this value: \( 37.9 \pm 1.5\).  What does that mean? What does that little plus/minus tell you and why is it there?

That plus/minus exists because no measurement taken is ever exact. If someone were to measure your height several times, they wouldn't get the same result every time.  You're not growing and shrinking; there's just a source of error --in this case, human error-- in the measurement.  The plus/minus --or rather the number after it-- tells you how uncertain that particular measurement is.  In fact, it's sometimes called the uncertainty, or the sigma (\(\sigma\)), or the standard deviation.

A sigma is a standardized convention used in statistics to say how far away a certain measurement is from the mean, or mu (\(\mu\)).

Does this look familiar?


This is a gaussian curve, sometimes called a normal distribution. I'll tell you why it probably looks so familiar in a bit, but first I'll tell you what it means. 

The y-axis of the curve above is frequency; it tells you how many times a specific value (different measurement values run along the x-axis) was recorded.  Going back to the height example, the mean, or \(\mu\), is the average of all of the height measurements taken.  If someone asked you how tall you are, you would probably tell them this value, because it's the one that gets repeated the most.  Similarly, when scientists report a value without an error, they're likely reporting \(\mu\).  

I said before that \(\sigma\) refers to how far from the mean a value is, which you can see in the plot above.  If you move more than one \(\sigma\) away from \(\mu\) in either direction, you are outside of 68.3% of the data.  That is what 1 \(\sigma\) means for a gaussian: the range within one step on either side of the mean contains 68.3% of the data. The more sigmas away from the mean you are, the more data you're cutting out.
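If you want to check that 68.3% yourself, here is a small sketch. For a gaussian, the fraction of data within \(k\) sigmas of the mean is \(\mathrm{erf}(k/\sqrt{2})\), where erf is the "error function" that ships with Python's math module. The function name `fraction_within` is just something made up for this example.

```python
import math


def fraction_within(k):
    """Fraction of a gaussian's data lying within k sigmas of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.1f}%")
```

Running it gives roughly 68.3%, 95.4%, and 99.7% for one, two, and three sigmas.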

There is an equation that relates \(\mu\), \(\sigma\), and these percentages (p):

 $p = \frac{1}{\sigma \sqrt{2\pi}}e^{\left [ -\frac{1}{2}\left ( \frac{A-\mu}{\sigma} \right )^{2} \right ]}$

where A is a specific measurement of, for example, your height. 

In a specific form of this equation where \(\mu = 0\) and \(\sigma = 1\) (called the standard normal distribution), the area under the curve between two values will tell you the percentages illustrated in the figure above. 

In its most general form, it gives you the probability that you will take a certain measurement given a specific \(\mu\) and \(\sigma\).  For example, the probability of a nurse telling you that you're 4' tall when you're usually measured to be 6'2" (\(\pm\) an inch or two) is really low.  
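Here is a minimal sketch of the equation above applied to the height example. The numbers are assumptions picked to match the text: a usual height of 6'2" (74 inches) and a \(\sigma\) of 1.5 inches. Strictly speaking, the equation returns a probability density (probability per inch here), so the values are best used for comparing how likely different measurements are relative to each other.

```python
import math


def gaussian_pdf(a, mu, sigma):
    """Gaussian probability density at measurement a, given mean mu and sigma."""
    z = (a - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


MU_INCHES = 74.0     # assumed usual height: 6'2"
SIGMA_INCHES = 1.5   # assumed uncertainty: "an inch or two"

for height in (74.0, 75.5, 48.0):  # the mean, one sigma away, and 4 feet
    density = gaussian_pdf(height, MU_INCHES, SIGMA_INCHES)
    print(f"{height:5.1f} in -> {density:.3e}")
```

The density at 4' comes out many, many orders of magnitude smaller than the density near 6'2", which is the "really low" from the nurse example.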

You might be able to tell by looking at the plot that the values of \(\mu\) and \(\sigma\) really affect the shape of the curve.  (Increasing \(\sigma\) makes the curve fatter, and this makes sense because the range that holds 68.3% of the data gets wider.)  Because of this, and because the area under a gaussian is so easy to measure, gaussians are often used to "fit" or model data sets. 

This might seem kind of naive, assuming that a shape so simple can be used to model complex natural systems, but it's really not!  Gaussians actually occur all over the place in nature because of something called the Central Limit Theorem.  The theorem states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed.  

In other, simpler words, let's say a nurse measures your height 50,000 times. Each measurement is independent of all the others.  If you were to plot those height measurements against their frequency (how many times that specific value was recorded), it would look like a gaussian.
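The nurse experiment is easy to fake on a computer. The sketch below assumes each measurement is your true height plus a dozen small, independent errors (slouching, a tilted ruler, rounding, and so on), each drawn from a flat, very non-gaussian distribution. Every number and name in it is made up for illustration. If the Central Limit Theorem holds up, about 68% of the 50,000 measurements should land within one \(\sigma\) of the mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable


def measure_height(true_height=74.0, n_errors=12):
    """One simulated measurement: true height plus many small flat errors."""
    errors = (random.uniform(-0.25, 0.25) for _ in range(n_errors))
    return true_height + sum(errors)


measurements = [measure_height() for _ in range(50_000)]

mu = statistics.fmean(measurements)
sigma = statistics.pstdev(measurements)
within_one_sigma = sum(abs(m - mu) <= sigma for m in measurements) / len(measurements)

print(f"mu = {mu:.2f} in, sigma = {sigma:.2f} in")
print(f"fraction within 1 sigma: {within_one_sigma:.3f}")
```

No single error in the simulation is gaussian, but their sum ends up looking like one, which is the theorem in action.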

That's why the gaussian probably looked so familiar to you!  Because it exists all around you.  I mean, it's actually because gaussians are in all of the books on math, science, and statistics, but I like the first reason more.

Monday, June 22, 2015

The Good Ol' Guess and Check Method, Hardcore Science Style

Do you remember when you learned what pi was for the very first time? I don't, but I do remember wondering how people could have possibly come up with such a precise measurement of pi with such ancient tools. I still don't really know how they did it, but today I found pi myself, so my brain doesn't feel as obligated as it did to figure out the old guys' ways.

To find pi, I used what is called a Monte Carlo method.  There are many ways to describe and use a Monte Carlo, but the simplest way (and the way I used it today) is to pick random numbers and analyze where they "fall." I say "fall" because this was called the "dart throwing" method when we were given our instructions.  I'm going to walk you through how I threw my darts.

First, let's start with a unit square, which is a square where each side has length 1.


And let's add a circle in that square, because we know since we're dealing with pi that we probably need a circle.


If I start randomly throwing "darts" at this image, what is the probability that it will land in the circle? It's the area of the circle over the area of the square. This makes sense, right? The bigger the circle, the better my chances are of hitting the circle with a dart. (I'm terrible at darts, so I'd need a really big circle to even hit it once. Luckily, my computer is way better at darts than I am.) More precisely, the probability is

$Prob = \frac{A_{circ}}{A_{square}} = \frac{\pi}{4} \rightarrow \pi = 4\frac{A_{circ}}{A_{square}}$

But, remember, the circle and square are imaginary. We don't really know their areas.  We just know (or we can find out) how many points fall in each. So the probability becomes 

$Prob = \frac{N_{circ}}{N_{square}}$

where N is the number of points in the region of interest. 

How do I find the number of points in the circle and square? Well, the square's easy.  Every point falls within the square, because the points that are in the circle are also in the square. The circle is trickier.  I have to make sure that the center of the circle is at (0,0).  This can be done by generating your random point values on an interval from -R to R.  Next, test each point, or dart, to see if its distance from the center is greater or less than R.  If it's less than R, the point is in the circle. 


And that's it!  By throwing enough darts (like, around 10000 or a million), I can eventually get to a really accurate value for pi.  Sure, my entire calculation is based on prior knowledge of the relationship between pi and the area of the circle, but you gotta start somewhere, right?
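The dart-throwing steps above can be sketched in a few lines of Python. This is a minimal version under the setup described in this post (a unit square with a circle of radius 1/2 centered at (0,0)), not the exact script from the assignment. The function name `estimate_pi` and the `seed` argument are made up for this example.

```python
import random


def estimate_pi(n_darts, radius=0.5, seed=None):
    """Estimate pi by throwing n_darts random darts at a square of side 2*radius."""
    rng = random.Random(seed)
    n_circle = 0
    for _ in range(n_darts):
        # Random point in the square, which is centered on (0, 0)
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        # Inside the circle if the distance from the center is less than R
        # (comparing squared distances avoids taking a square root)
        if x * x + y * y < radius * radius:
            n_circle += 1
    # Every dart lands in the square, so N_square is just n_darts
    return 4 * n_circle / n_darts


for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} darts -> pi is about {estimate_pi(n, seed=1):.4f}")
```

The estimate improves slowly: the typical error shrinks like one over the square root of the number of darts, so getting one more correct digit takes about 100 times more darts.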

Sunday, June 21, 2015

Like Sands Through the Hourglass, So Are the Days of My Weekend

I don't think I could have asked for a better weekend. Sure, I got next to no work done, but given the chance, I wouldn't change any of my time management decisions from the last two days.  This post probably won't have much science until the very end (and even then it won't be a lot), so if you want to read about my life, continue, but if you just want to get to the stuff about space, feel free to skip a few paragraphs.

It all started, as most weekends do, on Friday night. I had just closed my eyes for a nap when I got a call from my school-year roommate telling me that her best friend from high school was in town.  He lives in Georgia and I hadn't seen him since he came up to visit two years ago, so I promptly jumped right out of bed and ran over to see them. We spent the night talking and listening to music.  And then we got cheesy bread from Domino's.  It was a pretty great night.

On Saturday morning, I woke up to my alarm at 8:00 and realized that this was the first time since I moved into my summer housing that I had the option of sleeping as long as I wanted. Obviously I took it, and didn't wake up until after noon.  I probably would have slept longer but I got a text from my cousin saying that he was in town.  I woke up and got ready to meet him.  We had lunch and then went to meet his friends from college who live around Boston.  I only meant to spend a few hours with him/them, but ended up spending about nine. Oooops. But it was pretty fantastic. And I got some nice feedback on my ideas for my thesis from unbiased parties who don't have any reason to tell me things just to make me feel good!

Today was going to be the day I got stuff done. Did that happen? Not really.  Instead of working on my research in the morning, I slept. Instead of working on my research in the afternoon, I went all the way out to the suburbs of Boston to lose horribly to my cousin and his friends in Super Smash Bros. And instead of working on my research in the evening, I played --and won-- Spades with the rest of the Banneker group at my advisor's house.

Now it's Sunday night, and I haven't really done anything super productive, but I'm really okay with it. But just so I don't feel completely useless, I'm going to make and share a To Do list for this week.
  1. Finish testing the script that will take raw Kepler data and "detrend" it. I have no idea why it's called detrending. I just know that it takes raw Kepler data like this and subtracts the telescope's response to return a more-or-less flat line. 
  2. Learn how to run light curve fitting algorithms (the rigorous kind, not like what I did in Astro 16 with Excel)
  3. Start LaTeXing my report for the Exoplanet project
  4. Make a list of Observation IDs to process using HIPE and actually start processing them 
  5. Analyze the data that's already been processed. 

It's going to be a busy week, but I'm ready. 

Saturday, June 20, 2015

Danger: Galaxies Merging

We all make decisions that we know aren't the best for our health. This summer, I made the decision to do two *very different* research projects with two *very different* research groups.  You've already heard a lot about my first one dealing with exoplanets, but now I want to tell you a bit about the other one. The title above might give you a hint as to what it's about. 

I'm working with Drs. Howard Smith and Matthew on a sample of Ultra-Luminous Infrared Galaxies, trying to study the molecular outflows that happen when they merge.  I don't think I can say too much about this, or show any pictures, because that would basically be making our data and intentions public, but I can say a bit about what I've been doing.

Data Reduction. That's pretty much it. Every once in a while, I get to make a spreadsheet, but then I go right back to reducing data, initializing scripts that can take days to run.  I might sound bitter, but I actually don't mind.  (No sarcasm, I promise.)  Sure, it's not as exciting as the actual analysis that I'll get to do once this is all over, but taking raw data and manipulating it into a workable form is very satisfying. 

                         

This two-internship thing is more than just busy and slightly stressful.  It's confusing, too. I don't mean confusing in the sense that I have no idea what I'm doing with the research --though there definitely is some of that.  I mean confusing in the sense that my whole life plan has kind of been shaken.

I didn't want to go to grad school until about a year ago.  I didn't know I wanted to study astronomy until about two years ago.  So I guess I was due for another life-changing revelation, but it still took me by surprise.  What's changed?  I don't think exoplanets are boring anymore.  Maybe it's the project I'm working on, or the advisor I'm working on it with (no offense to my other advisor; he's really great!), or the fact that I finally get to work with exoplanets outside of the classroom in a real research environment where I can more or less give the work my undivided attention. It's most likely a combination of all three. But whatever it is, it means that galaxies -- their formation and evolution -- aren't my only astronomical love anymore. 

Grad school, particularly choosing what I want to study once I get there, just got a whole lot harder, but at least now there's a better chance that I'll enjoy whatever I end up doing.