South Side Rob

  1. Create my own Excel Conversion Template

    I don't know what their original format is. It's a service I pay for, and they only produce their reports in PDF format. I'm trying to parse this data and insert it into a database. The closest program Wondershare has that can handle these PDF files is PDF Converter. I have written custom code that maps every data point, and it keeps growing as I find new scenarios in every file. Years ago I used a program from Monarch, but they have gone commercial and now want $1,500 for just a 12-month license. I appreciate you looking into this. It doesn't necessarily have to go into Excel. I just need to get it into a format that I can read programmatically. Word is probably the closest my PDFs come to being readable, but I haven't been able to figure it out yet...
  2. Create my own Excel Conversion Template

    Daphne, something tells me you did not open the PDF files in question. A single one of the PDF files has just over 10,000 data points. If I custom draw all 10,000 data points, there is no guarantee that the next file will also have 10,000. It might have only 8,000, or even 12,000. Again, this comes down to having a variable number of rows in variable sections on each PDF page, and the number of pages is variable as well.
  3. Create my own Excel Conversion Template

    Also, the data extraction function would be great if my PDFs had fixed locations. My PDFs are broken into, let's say, 4 sections. The first section is the race header, which is somewhat fixed in its location but can sometimes include more rows for races that have many additional payout rows. The second section is the pace line rows for each horse that ran in the race. This can be anywhere from 4 rows to 20 rows. The columns are fixed but the rows are variable, and the PDF does not reserve blank rows when there are fewer than 20 pace line rows. The third section is the top times run by the horses in the past. Most of the time this is 10 rows (the columns are fixed), but sometimes there are fewer rows for horses who are just starting out. The last section is the Jockey/Trainer records, where I receive one row for each horse that ran in the race. Same as the pace lines, there can be as few as 4 rows and as many as 20. For large races (races with 10 or more horses), this data cannot fit on one page, so a 2nd page is needed to provide all the information. In summary, these PDFs have fixed positioning for the columns but a variable number of rows, and that is where I think the data extraction piece would fall short. As far as I can tell, the data extraction expects the same number of columns and the same number of rows...
  4. Create my own Excel Conversion Template

    The data extraction piece works, but the problem I have is that each page and each file has a variable set of rows to extract. Taking the two files I attached as an example, the first race in file DED-2018-0105.PDF (which is on page one) has 7 rows of data I need to extract (one row for each horse). On page two of the same file, there are 8 rows of data I need to extract. The number of rows I need to extract is almost never the same from file to file or from page to page. For this reason, I don't think the data extraction piece can be effective... One workaround I've considered is saving each race as its own file and having a different data extraction file for each race, based on the number of rows that need to be extracted, but splitting out these PDF files would be too time-consuming. Right now, I've been using the PDF Converter by Wondershare and writing custom VBA scripts that check each cell to see if multiple values landed in the same cell and split them out.
  5. Create my own Excel Conversion Template

    Sure. I have read as much of the manual as I could. My PDF files are already populated, so I will not be creating my own PDF files. I'm trying to extract the data from them into an Excel workbook. The closest method that works is using PDF Converter rather than PDF Element; however, I still have data that "runs into each other" instead of values landing in their own cells. I tried using the Form > Data Extraction task in PDF Element, but my problem is that each PDF file has about 5-15 pages of variable rows. Attached are two examples. One has 15 pages and the other has 9 pages. When you have the two PDFs open, you will notice that race 1 on page 1 of one PDF has, say, 7 starters (one row of data for each horse), while race 1 on page 1 of the other PDF has only 6 starters. None of my PDFs are fixed row-wise, but the fields across are. Thank you in advance for your help... DED-2018-0105.PDF GG-2018-0105.PDF
  6. Is there a way I can create my own template to convert my PDFs to Excel? The automated steps available do not work very well. They put data that should be on the same row on different rows, combine multiple pieces of data into the same cell, and so on. I know how to use the Data Extraction piece, which is nice, but I'd like to be able to create a template instead if possible. Please let me know. Thanks.
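
The posts above describe reports with fixed columns but a variable number of rows per section, plus cells where several values run together. One possible way to handle that, once a page has been converted to plain text lines, is to key on section headings instead of fixed row positions. The Python sketch below is only an illustration under assumptions: the marker strings in `SECTION_MARKERS`, the column boundaries passed to `slice_columns`, and the rule for what counts as a "numeric" value are all hypothetical and not taken from the actual reports.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical heading text; the real reports may label their sections differently.
SECTION_MARKERS = {
    "PACE LINES": "pace_lines",
    "TOP TIMES": "top_times",
    "JOCKEY/TRAINER": "jockey_trainer",
}

# Assumed shape of a numeric or time-like value, e.g. 7, 23.1 or 1:12.3.
NUMERIC_TOKEN = re.compile(r"^\d+(?::\d+)?(?:\.\d+)?$")


def split_sections(lines: List[str]) -> Dict[str, List[str]]:
    """Group lines by section, however many rows each section has."""
    sections: Dict[str, List[str]] = {"race_header": []}
    current = "race_header"
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        if text in SECTION_MARKERS:
            current = SECTION_MARKERS[text]
            # setdefault keeps earlier rows if a section continues on a 2nd page
            sections.setdefault(current, [])
            continue
        sections[current].append(line.rstrip())
    return sections


def slice_columns(row: str, bounds: List[Tuple[int, int]]) -> List[str]:
    """Cut one row into fields at fixed character positions."""
    return [row[start:end].strip() for start, end in bounds]


def split_merged_cell(cell: str) -> List[str]:
    """Split a cell only when every whitespace-separated token looks numeric."""
    tokens = cell.split()
    if len(tokens) > 1 and all(NUMERIC_TOKEN.match(t) for t in tokens):
        return tokens
    return [cell.strip()]  # leave names such as "Horse A" intact


def expand_row(row: List[str]) -> List[str]:
    """Apply split_merged_cell to every cell and flatten the result."""
    out: List[str] = []
    for cell in row:
        out.extend(split_merged_cell(cell))
    return out
```

Because rows are collected until the next heading appears, a race with 7 starters and a race with 20 go through the same code path. The weak point is the heading detection: if the converted text does not contain reliable section headings, some other anchor (for example, a recognizable column-header row) would have to take their place.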