+ Reply to Thread
Results 1 to 27 of 27

Importing text using macros to identify tags and merge data into columns

Hybrid View

abousetta Importing text using macros... 04-10-2010, 03:27 PM
shg Re: Importing text using... 04-10-2010, 04:51 PM
abousetta Re: Importing text using... 04-10-2010, 05:31 PM
shg Re: Importing text using... 04-10-2010, 08:31 PM
abousetta Re: Importing text using... 04-10-2010, 11:31 PM
abousetta Re: Importing text using... 04-10-2010, 11:42 PM
abousetta Re: Importing text using... 04-10-2010, 11:49 PM
abousetta Re: Importing text using... 04-10-2010, 11:50 PM
abousetta Re: Importing text using... 04-11-2010, 06:36 AM
shg Re: Importing text using... 04-11-2010, 09:31 AM
abousetta Re: Importing text using... 04-11-2010, 10:36 AM
shg Re: Importing text using... 04-11-2010, 10:40 AM
shg Re: Importing text using... 04-11-2010, 10:45 AM
abousetta Re: Importing text using... 04-11-2010, 01:42 PM
abousetta Re: Importing text using... 04-12-2010, 07:22 PM
shg Re: Importing text using... 04-13-2010, 09:24 PM
abousetta Re: Importing text using... 04-13-2010, 11:00 PM
shg Re: Importing text using... 04-13-2010, 11:42 PM
  1. #1
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Importing text using macros to identify tags and merge data into columns

    I have been struggling with a problem that I hope someone can help me with. (Background: I am a novice at VBA and so please bear with me with some of the more complex syntax)

    I have an output (attached txt file) that contains a series of tags for each reference (citaiton). The number of tags may vary from one reference to the next, but will always be preceded by a return carriage (blank line) and end with a return carriage. All tags are CAPITAL letters and contain six characters. The first tag will always be "TY - " and the last tag will always be "ER - " .

    What I am trying to do is create a new excel sheet with the information (identified by specific tags) merged into one cell. In the dummy excel sheet (attached) I demonstrate that I want to put in the first column the data from the reference id tag ("ID - ") and in the column next to it a merge of tags ("A1 - ","JF - ","Y1 - "), but without the three backslashes ("///") (i.e. just the year without month or day).

    Any ideas will be highly beneficial.

    Thanks.

    Abousetta
    Attached Files Attached Files
    Last edited by abousetta; 04-13-2010 at 11:02 PM.

  2. #2
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    Try this.
    Option Explicit
    
    Sub x()
        Dim sFile       As String
        Dim iFF         As Integer
        Dim sInp        As String
        Dim asDat(1 To 3) As String
        Dim iRow        As Long
    
        sFile = Application.GetOpenFilename("Text files, *.txt")
        If sFile = "False" Then Exit Sub
        
        iFF = FreeFile
        Open sFile For Input As #iFF
    
        Do While Not EOF(iFF)
            Input #iFF, sInp
            
            Select Case Left(sInp, 6)
                Case "ID  - "
                    iRow = Cells(Rows.Count, "A").End(xlUp).Row + 1
                    Cells(iRow, "A").Value = Mid(sInp, 7)
                Case "A1  - "
                    asDat(1) = Mid(sInp, 7)
                Case "JF  - "
                    asDat(2) = Mid(sInp, 7)
                Case "Y1  - "
                    asDat(3) = Replace(Mid(sInp, 7), "/", "")
                Case vbNullString
                    If iRow <> 0 Then Cells(iRow, "B").Value = Join(asDat, ", ")
                    Erase asDat
            End Select
        Loop
    
        If IsEmpty(Cells(iRow, "B")) Then Cells(iRow, "B").Value = Join(asDat, ", ")
        Close iFF
    End Sub
    Entia non sunt multiplicanda sine necessitate

  3. #3
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Thank you so much shg. It worked and I have studying the code to understand how to better myself as a vba developer.

    I have run the macro in a large dataset and it generally work perfectly. There are a few minor situations in which the code seems to not work correctly (and its not the macro's fault) because the data is missing or presented differently.

    The first situation is when there are several tags of the same kind (e.g. multiple authors and each is identified by the tag "A1 - ". Is there a way for it to only use the first instance of the tag and ignore the rest.

    The second situation is when the date is presented as not as just the year, but also the month and date (e.g. 2010/04/10). Is there a way to only include the year (i.e. first four intigers).

    Last situation I found that is problematic is when the information is possibly located in several different tags. This is similar to the first problem with the author names, but the journal name could be presented in one of several tags (or presented in different ways in several tags). I only want the journal name as presented in the first tag, but if not present I would like the macro to look for the name in one of the other tags. The tags is order of importance are "JF - " , "JO - " , "JA - " , "J1 - " , "J2 - ".

    Sorry for not noticing these possible bugs from the start.

    Your efforts are highly appreciated.

    Best wishes,

    Abousetta

  4. #4
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    Untested:
    Option Explicit
    
    Sub x()
        Dim sFile       As String
        Dim iFF         As Integer
        Dim sInp        As String
        Dim asDat(1 To 3) As String
        Dim iRow        As Long
    
        sFile = Application.GetOpenFilename("Text files, *.txt")
        If sFile = "False" Then Exit Sub
        
        iFF = FreeFile
        Open sFile For Input As #iFF
    
        Do While Not EOF(iFF)
            Input #iFF, sInp
            
            Select Case Left(sInp, 6)
                Case "ID  - "
                    iRow = Cells(Rows.Count, "A").End(xlUp).Row + 1
                    Cells(iRow, "A").Value = Mid(sInp, 7)
                
                Case "A1  - "
                    If Len(asDat(1)) = 0 Then asDat(1) = Mid(sInp, 7)
                
                Case "JF  - "
                    asDat(2) = sInp
                
                Case "JO  - "
                    Select Case Left(asDat(2), 6)
                        Case "JA  - ", "J1  - ", "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "JA  - "
                    Select Case Left(asDat(2), 6)
                        Case "J1  - ", "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "J1  - "
                    Select Case Left(asDat(2), 6)
                        Case "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "J2  - "
                    Select Case Left(asDat(2), 6)
                        Case vbNullString
                            asDat(2) = sInp
                    End Select
                    
                Case "Y1  - "
                    asDat(3) = Mid(sInp, 7, 4)
                    
                Case vbNullString
                    asDat(2) = Mid(asDat(2), 7)
                    If iRow <> 0 Then Cells(iRow, "B").Value = Join(asDat, ", ")
                    Erase asDat
            End Select
        Loop
        
        If IsEmpty(Cells(iRow, "B")) Then
            asDat(2) = Mid(asDat(2), 7)
            Cells(iRow, "B").Value = Join(asDat, ", ")
        End If
    
        Close iFF
    End Sub

  5. #5
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Hi shg,

    Thanks for all your help. I have run the code on real time from a large database and I have come across a bug which I can't explain. In the attached examples, the macro should get the author name and date but for some reason returns blanks. The weird thing is that with similar references they seem to return the information. The only thing I can tell that seems to be the same in all of them is that the date is only four digits and the backslashes for the missing month and year (e.g. 2010///). What I don't understand is that with other references, this is not an issue and so I don't know how to circumvent this problem.

    Thanks again for all your help.

    Abousetta

    N.B. Sorry that I uploaded the file as .doc file instead of .txt because the file was too big to upload as txt

  6. #6
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    No attachment.

  7. #7
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Also, as I am trying to self-teach myself vba, could you please tell me what function sInp has? I have tried to google it but have not found an explaination. I have only found it used in several macros, but I still don't understand what it is doing.

    Thanks.

    Abousetta

  8. #8
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Sorry for the confusion. Here is a text file with the actual output with the problematic citations. I have provided a samples since the upload tool will only allow me to upload a txt no larger than 100kb. That was the reason I originally saved it as doc because it was around 500kb.

    Abousetta

  9. #9
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    the under 100kb text file

    Abousetta
    Attached Files Attached Files

  10. #10
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Hi shg,

    I have been investigating the possible source of the bug and I think I might understand now where the problem lies. As far as I can tell, there are two sources that are causing the isolated incidences:

    1) Some of the data in the same tag is seperated by a carriage return.
    2) Some of the tags are not six characters, but instead five characters

    These two issues seem to be confusing the macro. Is there a way to specifically only look at information that begins after a carriage return, but has the exact tag details that I described previously.

    The first attachement (Updated Bugs.txt) contains three examples that have the problem. The second attachment (Updated Bugs (problem solved).txt) shows that once I removed the problematic areas from the text file, it works perfectly.

    Thank you again for all your time and consideration.

    Abousetta
    Attached Files Attached Files

  11. #11
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    Try this. It uses the TY tag as an indicator of a new entry.
    Option Explicit
    
    Sub x()
        Dim sFile       As String
        Dim iFF         As Integer
        Dim sInp        As String
        Dim asDat(1 To 3) As String
        Dim iRow        As Long
    
        sFile = Application.GetOpenFilename("Text files, *.txt")
        If sFile = "False" Then Exit Sub
        
        iFF = FreeFile
        Open sFile For Input As #iFF
    
        Do While Not EOF(iFF)
            Input #iFF, sInp
            
            Select Case Left(sInp, 6)
                Case "TY  - "
                    asDat(2) = Mid(asDat(2), 7)
                    If iRow <> 0 Then Cells(iRow, "B").Value = Join(asDat, ", ")
                    Erase asDat
                
                Case "ID  - "
                    iRow = Cells(Rows.Count, "A").End(xlUp).Row + 1
                    Cells(iRow, "A").Value = Mid(sInp, 7)
                
                Case "A1  - "
                    If Len(asDat(1)) = 0 Then asDat(1) = Mid(sInp, 7)
                
                Case "JF  - "
                    asDat(2) = sInp
                
                Case "JO  - "
                    Select Case Left(asDat(2), 6)
                        Case "JA  - ", "J1  - ", "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "JA  - "
                    Select Case Left(asDat(2), 6)
                        Case "J1  - ", "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "J1  - "
                    Select Case Left(asDat(2), 6)
                        Case "J2  - ", vbNullString
                            asDat(2) = sInp
                    End Select
                
                Case "J2  - "
                    Select Case Left(asDat(2), 6)
                        Case vbNullString
                            asDat(2) = sInp
                    End Select
                    
                Case "Y1  - "
                    asDat(3) = Mid(sInp, 7, 4)
                    
            End Select
        Loop
        
        If IsEmpty(Cells(iRow, "B")) Then
            asDat(2) = Mid(asDat(2), 7)
            Cells(iRow, "B").Value = Join(asDat, ", ")
        End If
        
        Close iFF
    End Sub
    Last edited by shg; 04-11-2010 at 10:46 AM. Reason: Removed the debug statement

  12. #12
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Thank you once again shg...

    I have run the updated code you sent and so it works like a charm.

    Have a great weekend.

    Abousetta

  13. #13
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    You're welcome, glad we got it sorted.

  14. #14
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    ... I was wondering what the function of sInp is? I have seen it used in other macros but can't find an explanation to its use.
    It's just the string variable that receives one line of the file from the Input statement.

  15. #15
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Thanks shg... you have been very helpful and patient with me.

    If possible, I have found one (hopefully) last issue I hope you will be able to help me resolve.

    The code works flawlessly according to the guidelines I have previously described. But I have another situation, and I have not been able to modify the code correctly to run correctly. The situation is related to the titles of the reference. The references could contain a comma or could be divided over several tags (all which will be "T1 - ". The problems are:

    1) Once the code reaches the comma in the title, it stops that line and moves to the next tag, but the comma is part of the title and not a stopper

    2) The title can be divided into several tags with the same six character id "T1 - ". I would like to include all the data from all the tags in a sequential order with a semicolon between them. Is this possible.

    Apologies for not picking up these problems earlier. I have been trying all morning to modify your code so I don't have to bother you again, but the student is still learning from the master .

    Thanks again for everything.

    Abousetta

    P.S. attached is one example where both problems exist
    Attached Files Attached Files

  16. #16
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    The data extracted is author (A1), journal (JF, JO, JA, ...), and date (Y1). The T1 tag is not processed at all. In the last example you posted, there is no journal tag, ergo no entry for journal.
    Last edited by shg; 04-11-2010 at 02:14 PM.

  17. #17
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Hi shg,

    I have posted the example as requested.

    Thanks again for all your help.

    Abousetta

  18. #18
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    abousetta,

    Here's a modification to capture all the key words.

    The data assembly is getting increasingly random, and begs a completely different approach, like a data-driven grammar. I'm going to leave that to someone else.
    Attached Files Attached Files

  19. #19
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Importing text using macros to identify tags and merge data into columns

    Thank you for all your help. I know that I have been troublesome, but I am truly grateful for all your help.

  20. #20
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Importing text using macros to identify tags and merge data into columns

    Not troublesome at all -- just more requirements creep than I have time for. You did a good job making incremental changes to the code.

    By data-driven, grammar-based, I meant something where the row header has strings that provide parsing information that the software can read, to know what tags to collect for the column and how to assemble them. Once implemented, the row headers would control the data assembly entirely, without modifying the code. Columns could be added and deleted at will.

    Creating that grammar would require a comprehensive set of cases, a day or so of head-scratching, and then the coding.

    Best of luck.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1