+ Reply to Thread
Results 1 to 4 of 4

pulling specific text from a notepad document

Hybrid View

  1. #1
    Registered User
    Join Date
    07-22-2009
    Location
    connecticut
    MS-Off Ver
    Excel 2007
    Posts
    6

    pulling specific text from a notepad document

    Hi everyone,

    I believe my problem is relatively simple

    I have a folder full of notepad documents containing source code from a web page.

    I would like to pull certain text from the source code into an excel file. The text I am trying to grab appears in virtually the same spot in every document, however it may not always be the same length.

    How can I do this in VB?

    Thanks in advance

  2. #2
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: pulling specific text from a notepad document

    The simplest method would be (shown in air code, you'll have to work out the steps):

    1) Import the text file into a new sheet
    2) Find SPECIFIC text that marks the beginning of the needed text
    3) Find SPECIFIC text that marks the end of the needed text
    4) Copy the found range of text to a permanent sheet
    5) Close the temporary sheet
    6) Repeat with all files in a folder.

    Don't ask "how do I do this?" without providing complete details. Upload your desired "results" workbook and several sample text files. make sure you give enough info someone can actually offer more help.
    _________________
    Microsoft MVP 2010 - Excel
    Visit: Jerry Beaucaire's Excel Files & Macros

    If you've been given good help, use the icon below to give reputation feedback, it is appreciated.
    Always put your code between code tags. [CODE] your code here [/CODE]

    ?None of us is as good as all of us? - Ray Kroc
    ?Actually, I *am* a rocket scientist.? - JB (little ones count!)

  3. #3
    Registered User
    Join Date
    07-22-2009
    Location
    connecticut
    MS-Off Ver
    Excel 2007
    Posts
    6

    Re: pulling specific text from a notepad document

    Here is a rough version of the code I'm working with. I've included the specific tags that surround the data I'm after (the <table> and <td> tags are not part of the specific tags, i just used them in the example code). These tags hold true throughout all of my data. There are multiple products in each text file.

    The result sheet I'm after is just an excel spreadsheet with 4 columns, one including the product name, the price, the time sold, and the date sold. Also I need this to run on all the text documents in the folder. The text documents are named 1.txt,2.txt,3.txt, etc. For this lets say they are all saved in a folder C:\webpages

    <html>
    <body>
    
    <table border="0" cellpadding="3px" cellspacing="3px">
    <tr>
    <td>
    	<span style="font-size: 1.1em; font-weight: bold;"><a href="tooth-brush.html">Tooth Brush</a>&nbsp;</span>
    </td>
    <td align="center">
    	<bold>05:00 PDT</bold>
    </td>
    <td>
    <td align="center">
    	<strong>$2.00</strong>
    </td>
    <td>
    	<div style="font-size: 0.7em;">08-06-2009</div>
    </td>
    </tr>
    </table>
    
    <table border="0" cellpadding="3px" cellspacing="3px">
    <tr>
    <td>
    	<span style="font-size: 1.1em; font-weight: bold;"><a href="cell-phone.html">Cell Phone</a>&nbsp;</span>
    </td>
    <td align="center">
    	<bold>06:02 PDT</bold>
    </td>
    <td>
    <td align="center">
    	<strong>$50.00</strong>
    </td>
    <td>
    	<div style="font-size: 0.7em;">08-06-2009</div>
    </td>
    </tr>
    </table>
    
    </body>
    </html>
    Last edited by smoosh1011; 08-06-2009 at 03:21 PM.

  4. #4
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: pulling specific text from a notepad document

    Make sure there are no other text files in that folder, give this a try with two or three files in there. It worked for me for finding the two items in 3 copies of the file, so it should work for any number of items if the layout stays consistent.
    Option Explicit
    Sub ImportHTMLTextProducts()
    'Open all .TXT in specific folder and import data
    'JBeaucaire (8/6/2009)     (2007 compatible)
    Dim fName As String, fPath As String, buf As String
    Dim Count As Long, NR As Long, cFind As Long
    Dim wNew As Workbook, v As Boolean
    
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.DisplayAlerts = False
    
    'Setup this report
    Set wNew = ThisWorkbook
    v = Evaluate("ISREF(Products!A1)")
        If Not v Then
            Worksheets.Add(After:=Worksheets(Worksheets.Count)).Name = "Products"
            Range("A1") = "Product Name"
            Range("B1") = "Price"
            Range("C1") = "Date"
            Range("D1") = "Time"
            With Range("A1:D1")
                .Font.Bold = True
                .Borders(xlEdgeBottom).LineStyle = xlContinuous
                .Borders(xlEdgeBottom).Weight = xlMedium
                .Interior.ColorIndex = 35
            End With
            Columns("B:B").NumberFormat = "$#,##0.00"
            Columns("C:C").NumberFormat = "m/d/yyyy"
    
            Range("A2").Select
            ActiveWindow.FreezePanes = True
            wNew.Activate
        Else
            Sheets("Products").Range("A2:D" & Rows.Count).ClearContents
        End If
    
    'Setup File Listing
        fPath = "C:\webpages\"
        If Right(fPath, 1) <> "\" Then fPath = fPath & "\"
        fName = Dir(fPath & "*.txt")
        NR = 2
    
    'Import Files
        Do While Len(fName) > 0
            Workbooks.OpenText Filename:=fPath & fName, Origin:=437, StartRow:=1, _
                DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, _
                Tab:=False, Semicolon:=False, Comma:=False, Space:=False, Other:=False, _
                FieldInfo:=Array(1, 2), TrailingMinusNumbers:=True
            cFind = Range("A:A").Find(What:="a href", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
                :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Row + 1
            Count = 0
                Do
                'Get Products
                    Range("A:A").Find(What:="a href", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
                        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Activate
                    If ActiveCell.Row < cFind And Count > 0 Then Exit Do
                    buf = ActiveCell.Text
                    buf = Left(buf, InStr(buf, "</a>") - 1)
                    buf = Right(buf, Len(buf) - InStrRev(buf, ">"))
                    wNew.Sheets("Products").Range("A" & NR) = buf
                'Get Time
                    Range("A:A").Find(What:="PDT", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
                        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Activate
                    buf = Left(ActiveCell.Text, InStr(ActiveCell.Text, "PDT") + 2)
                    wNew.Sheets("Products").Range("D" & NR) = Right(buf, 9)
                'Get Price
                    Range("A:A").Find(What:="$", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
                        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Activate
                    buf = ActiveCell.Text
                    buf = Left(buf, InStr(buf, "</strong>") - 1)
                    buf = Right(buf, Len(buf) - InStrRev(buf, "$") + 1)
                    wNew.Sheets("Products").Range("B" & NR) = buf
                'Get Date
                    Range("A:A").Find(What:="</div>", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
                        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Activate
                    buf = ActiveCell.Text
                    buf = Left(buf, InStr(buf, "</div>") - 1)
                    wNew.Sheets("Products").Range("C" & NR) = Right(buf, 10)
                    
                    NR = NR + 1
                    Count = Count + 1
                Loop
            
            'Get next filename ready
                fName = Dir
                ActiveWorkbook.Close False
        Loop
        
    'Cleanup
        Columns("A:D").AutoFit
        Application.DisplayAlerts = True
        Application.EnableEvents = True
        Application.ScreenUpdating = True
    End Sub
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1