Monday, 20 April 2015

Comparing values yields different results

I have a script that reads data from a CSV file into a pandas DataFrame. It then iterates over each row and passes the row as a pandas Series to another module. There, one of the columns is checked to see whether it is bigger than a value contained in another pandas Series, e.g.:

df_1:

col_A, col_B, col_C
234.0, 563.2, 565.5
565.7, 324.3, 5676.4

df_2:

col_X, col_Y, col_Z
124.1, 763.5, 562.1

In the above example, the first row of the dataframe is selected and sent to a function which checks whether df_1['col_A'] (i.e. 234.0) is bigger than df_2['col_X'] (i.e. 124.1). This all works perfectly.
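To make the working CSV case concrete, here is a minimal sketch of it using the sample data above. The helper name `exceeds`, the variable `thresholds`, and the use of `io.StringIO` in place of real files are assumptions for illustration only, not the actual script.

```python
import io

import pandas as pd

# Sample data from the post, inlined instead of read from real CSV files
CSV_1 = "col_A,col_B,col_C\n234.0,563.2,565.5\n565.7,324.3,5676.4\n"
CSV_2 = "col_X,col_Y,col_Z\n124.1,763.5,562.1\n"

df_1 = pd.read_csv(io.StringIO(CSV_1))
# df_2 has a single row, so take it as a Series of threshold values
thresholds = pd.read_csv(io.StringIO(CSV_2)).iloc[0]


def exceeds(row, limits):
    """Hypothetical helper: True if row['col_A'] is bigger than limits['col_X']."""
    return bool(row["col_A"] > limits["col_X"])


# Each row arrives in the helper as a pandas Series, as described above
results = [exceeds(row, thresholds) for _, row in df_1.iterrows()]
print(results)
```

With the sample values, both rows have a col_A larger than 124.1, so both comparisons come out True.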

My problem comes now that I have changed the script to read the original dataframe from a PostgreSQL database instead of a CSV file. Everything else has remained the same. The comparison appears to do nothing: it doesn't evaluate to True or False, it just skips the evaluation completely.

The original code to compare the two values (each contained in a pd series) which worked correctly when reading in from csv is:

if df_1['col_A'] > df_2['col_X']:
    #do something

I have checked the types of the two values both when reading in from csv and from postgresql. It is comparing:

<class 'float'> and <class 'numpy.float64'>

The values stored in the database are of type numeric(10,2).
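One way to double-check the types is sketched below. It assumes, purely for illustration, that the numeric(10,2) column could arrive as decimal.Decimal objects in an object-dtype Series; whether that happens depends on the driver and on how the query is read, and the names `db_row` and `limits` are hypothetical stand-ins for the real data.

```python
from decimal import Decimal

import pandas as pd

# Stand-in for a row fetched from the database.
# Decimal values are an assumption here, not something confirmed above.
db_row = pd.Series({"col_A": Decimal("234.00"), "col_B": Decimal("563.20")})
limits = pd.Series({"col_X": 124.1, "col_Y": 763.5})

# A Series holding Decimal objects has dtype 'object', not float64
print(db_row.dtype)
print(type(db_row["col_A"]), type(limits["col_X"]))

# Convert explicitly so both sides of the comparison are plain floats
numeric_row = db_row.astype("float64")
lhs = float(numeric_row["col_A"])
rhs = float(limits["col_X"])

is_bigger = lhs > rhs
print(repr(lhs), repr(rhs), is_bigger)
```

Printing repr() of both operands and the result of the comparison just before the if statement shows whether the branch is really being skipped or whether the condition is simply False.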

I have tried the following to no avail:

if df_1.loc['col_A'] > df_2.loc['col_X']:
and
if Decimal(df_1.loc['col_A']) > Decimal(df_2.loc['col_X']):
and
if abs(df_1.loc['col_A']) > abs(df_2.loc['col_X']):

I'm completely stumped, since the only thing that has changed is getting the data from a database instead of a CSV file. The resulting datatypes are still the same, i.e. float compared against numpy.float64.
