Playing with Gravity to Learn CUDA: An N-Body Simulation

Now that we have a working simulation engine and a real-time display of a running simulation, it's time to see what we can do with this gravity simulator we've been building in CUDA. We'll start off with a complete simulation of the solar system to see if we can get a reasonable multi-body system simulating correctly. Then we'll move on to filling up the 640 cores on my GeForce GTX 1050 card to make full use of the GPU and see where the base limits are for this simulator. This should be fun.

The Solar System

Before we're able to run a simulation of the whole solar system, we need to know where all of the planets are at a specific time and what their velocity vectors are. Just knowing their average distance from the Sun and their average speed isn't going to cut it in this case because the planetary orbits are not exactly circular and each planet's mass will influence the others. If we start them all off on the x axis for the initial conditions, the simulation will be something, but it won't be as accurate of a depiction of the real solar system as it could be.

For this data Wikipedia is not enough, so instead I turned to the JPL HORIZONS ephemeris website. Ephemeris is a fancy word for a table of orbital data for objects in the solar system. JPL keeps track of over a million different asteroids, comets, and the planets, which is just incredible to think about, but we're only concerned with ten of those objects. I grabbed the X-Y positions and velocities for the Sun, the planets, and Pluto all at the same time and day in km/s. I intentionally ignored the Z-axis data since we're not using that in the simulator. All of the bodies are assumed to be in the same plane, which isn't entirely accurate but should be close enough, even for Pluto because it's mass is so much smaller than the nearest planet, Neptune, and so it won't have much effect on the rest of the system. I also got the masses of each body from the JPL planet physical characteristics table page.

Since the Sun starts at the origin in this configuration of the data, I needed to figure out the momentum of the rest of the system so that I could set the Sun's velocity to compensate for it. Otherwise, the system would slowly drift in the direction of that momentum. I put the velocity and mass data in a spreadsheet and calculated the velocity that the Sun should have with the following equation:

v_Sun = -(Σm_iv_i) / m_Sun

Where v stands for the velocity vector of each object and m stands for mass. With the compensating velocity of the Sun calculated, we can plug in the masses, positions, and velocities of each body in our arrays for initial conditions in main(), being careful to convert km to meters for the positions and velocities:

int main(int argc, char** argv) {
    initGL(&argc, argv);

    const float m[_num_bodies] = { 
        0.0,        // Origin
        1.989e30,   // Sol
        3.302e23,   // Mercury
        4.8685e24,  // Venus
        5.9724e24,  // Earth
        6.4171e23,  // Mars
        1.8981e27,  // Jupiter
        5.6832e26,  // Saturn
        8.6813e25,  // Uranus
        1.0241e26,  // Neptune
        1.307e22    // Pluto
    };
    const vec3d d0[_num_bodies] = { 
        {0, 0, 0, 1},                                          // Origin
        {1, 0, 0, 1},                                          // Sol
        {3.661980629929986E+10, 3.055470346266923E+10, 0, 1},  // Mercury
        {-5.873169609724162E+9, -1.085620928662236E+11, 0, 1}, // Venus
        {-7.999642430933115E+10, 1.236118155911581E+11, 0, 1}, // Earth
        {5.102717483955609E+10, 2.243131860385265E+11, 0, 1},  // Mars
        {4.743960342783579E+11, -5.952396785448154E+11, 0, 1}, // Jupiter
        {8.356241104235727E+11, -1.237809014804413E+12, 0, 1}, // Saturn
        {2.288075301350948E+12, 1.873673832632885E+12, 0, 1},  // Uranus
        {4.408978859574347E+12, -7.724145243644576E+11, 0, 1}, // Neptune
        {2.113117481871498E+12, -4.659161032056704E+12, 0, 1}  // Pluto
    };
    const vec3d v0[_num_bodies] = { 
        {0, 0, 0, 1},                                         // Origin
        {1.159902211052130E+01, 1.035640015302160E+01, 0, 1}, // Sol
        {-4.080026420194861E+4, 3.948947880398944E+4, 0, 1},  // Mercury
        {3.473509242264790E+4, -2.025387845405878E+3, 0, 1},  // Venus
        {-2.548479662572985E+4, -1.630237154083575E+4, 0, 1}, // Earth
        {-2.270904598757260E+4, 7.431844827548277E+3, 0, 1},  // Mars
        {1.007104907684151E+4, 8.767860631883241E+3, 0, 1},   // Jupiter
        {7.478797389642545E+3, 5.389266802048683E+3, 0, 1},   // Saturn
        {-4.355463670767629E+3, 4.959991511134584E+3, 0, 1},  // Uranus
        {9.122491384080702E+2, 5.395758658250224E+3, 0, 1},   // Neptune
        {5.072923308461469E+3, 1.070490921773572E+3, 0, 1}    // Pluto
    };
    
    init(m, v0, d0);

    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutIdleFunc(idle);

    glutMainLoop();

    finalize();

    return 0;
}

Notice that the position for the Sun was offset by 1 meter in the x direction, otherwise we would have a zero-distance situation between the Origin and the Sun, which would make the Sun disappear in the calculations. Now we need to adjust the scale and timestep for viewing the whole solar system because it's currently set up for a field of view that extends out to the Earth. Setting the _scale to 9.0e12 and _dt to 9000 worked fairly well for seeing the whole solar system:

Clearly, the inner planets move much faster than the outer planets, and the timestep was set high enough to see some motion in the outer planets without having to wait forever. Meanwhile, the inner planets are whipping around the Sun many times in this one and a half minute video. If we instead want to take a closer look at the inner planets out to Mars, we can set _scale to 9.0e11 and _dt to 1800 for a time step of 30 minutes.

With this view, we can see the four inner planets clearly, and Jupiter makes an appearance on the right side of the video in the beginning and then reappears in the upper left corner near the end of the video. It appears that the simulation is working the way it's supposed to for the solar system, which is promising and exciting.

Scaling Up N

Now that we have a working simulation with 10 bodies, let's try scaling that way up. We're not going to try to recreate the asteroid belt or anything in this case. That would be way too tedious. Instead, let's take a different tack and generate bodies with random mass and position. We'll start them all off with zero velocity for this attempt. We can add a function for initializing these bodies as follows:

void initBodies(float* m, vec3d* v, vec3d* d) {
    // set origin
    m[0] = 0.0;
    d[0] = { 0, 0, 0, 1 };
    v[0] = { 0, 0, 0, 1 };
    m[1] = _scale * _scale * 100;
    d[1] = { 1, 0, 0, 1 };
    v[1] = { 0, 0, 0, 1 };

    for (int i = 2; i < _num_bodies; i++) {
        m[i] = rand() * _scale / 2 + _scale / 2;
        m[i] *= m[i];

        d[i].x = (rand() / (float)RAND_MAX - 0.5) * _scale * 2;
        d[i].y = (rand() / (float)RAND_MAX - 0.5) * _scale * 2;
        d[i].z = 0.0;
        d[i].a = 1.0;

        v[i].x = 0.0;
        v[i].y = 0.0;
        v[i].z = 0.0;
        v[i].a = 1.0;
    }
}

The function takes arguments of arrays for the mass, velocity, and position, and it's going to fill these arrays with the random initial conditions for each of the bodies. I try to set a zero-mass origin and a large central mass that's slightly offset from it, but it turns out that there's enough other mass in the configuration that the central mass doesn't act as the intended anchor and instead slowly flies off, taking the origin point with it. Heh, heh. Oh, well, it was worth a try. You can see how the for loop creates random bodies with masses uniformly in the range of [_scale/2, _scale] and positions uniformly distributed around the viewable area established by _scale. I found this to create a relatively interesting simulation for viewing. The arrays can be initialized from main() by calling initBodies() like so:

int main(int argc, char** argv) {
    initGL(&argc, argv);

    float* m = new float[_num_bodies];
    vec3d* d0 = new vec3d[_num_bodies];
    vec3d* v0 = new vec3d[_num_bodies];
    initBodies(m, v0, d0);
    
    init(m, v0, d0);

    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutIdleFunc(idle);

    glutMainLoop();

    finalize();

    return 0;
}

With this setup, 1024 bodies, the scale set to 9.0e12, and the timestep set at 900 we get a simulation like the following:

First, it should be noted that the 1024 bodies is more than the number of cores in my GTX 1050 card, but CUDA takes care of allocating the threads to cores automatically. The maximum number of threads in a block is 1024, and we're currently using a single block, so this is the maximum number of bodies for the simulation at the moment. It looks like the card can handle this number easily.

The other thing that should be immediately obvious from this simulation is that most of the bodies are ejected from the scene relatively quickly. That may seem surprising because we may expect with a starting velocity of zero, all of that mass would gravitate towards the center, but we should take some other things into consideration here. First, the initial setup is not at all like anything we see in space. We have an area about the size of our solar system that's uniformly filled with 1,024 objects that are about the mass of large asteroids, and none of them are moving. As soon as one of these bodies has a close interaction with another body, it is quite likely to accelerate significantly and get thrown out of the view and the system.

If we wanted to try to simulate a proto-planetary system, we would need to start with a ridiculously large amount of bodies that were on the scale of dust particles. That type of simulation at that scale isn't currently possible with this setup, and it might be impossible with a GPU of this performance level. If we wanted to sidestep that issue and start at a later time during the system's formation when larger bodies have already started to form, we would need to have the bodies moving in a generally circular direction around the center, and at least some of them would likely need to be much more massive. That simulation might be more possible with this GPU, but the initial conditions would need to be set with more intention to get a reasonable result. Also, the simulation engine has no way to combine bodies when collisions occur. It doesn't even detect collisions or have a sense of the physical size of the bodies, so that type of simulation wouldn't really work. Plus, the timescales we're talking about with these types of simulations would be millions of years, and the current timestep was half an hour. That would need some adjustment.

Another possible option for a more realistic simulation is trying to simulate a galaxy. In order to do that, we would have to scale way out in viewing area, mass of bodies, number of bodies, and timestep. Collisions are far less likely at the scale of galaxies, so that aspect could be safely ignored, but galaxies have hundreds of millions or billions of stars in them and timescales are at least in millions of years to see reasonable amounts of motion. We may be able to look at something like a star cluster, but a galaxy simulation might be out of the realm of possibility with the hardware I'm working with and the way the current engine is set up.

It may be worth looking into setting up a star cluster simulation for the next episode. Star clusters will have thousands to tens of thousands of stars, which is certainly within the realm of possibility for these simulations, and at least open star clusters will disassociate over time, just like this 1024-body simulation did. Having initial conditions of zero velocity or small random velocities may be a reasonable starting point for such a simulation, so that'll be the plan while we see what we can do about optimizing the simulation to enable a much higher number of bodies at once.

Lucid Mesh

Search This Blog