Fix 404 Page Not Found Link Errors

Location: http://www.mvps.org/dmcritchie/ie/fix404.htm      
Home page: http://www.mvps.org/dmcritchie/excel/excel.htm
[View without Frames]

Table of 404 Errors

The following table was created from rearranging data created by a program you can download and run from your machine.

Xenu's Link Sleuth, Tilman Hausherr, probably seen on LockerGnome.
  [-- http://home.snafu.de/tilman/xenulink.html --]

Errors as reported

Broken links, ordered by page:

file:///c:\mywebsite\dmcritchie\excel\excel.htm
        http://groups.google.com/groups?oi=djq&ic=1&selm=an_400866288
          \_____ error code: 403 (forbidden request)
        http://groups.google.com/groups?oi=djq&as_umsgid=OEQqaSavAHA.1396@tkmsftngp05
          \_____ error code: 403 (forbidden request)
        http://triton.dsu.edu/uswest97/msoffice/excel.htm
          \_____ error code: 12002 (timeout)
        http://www.psych.ucsb.edu/~taap/resources/excel.html
          \_____ error code: 404 (not found)
        http://spreadsheetstyle.com/links/links.htm
          \_____ error code: 12007 (no such host)
        http://support.microsoft.com/support/excel/
          \_____ error code: 404 (not found)
        http://spreadsheetstyle.com/style/10tips.htm
          \_____ error code: 12007 (no such host)
        http://spreadsheetstyle.com/index.htm
          \_____ error code: 12007 (no such host)
        http://spreadsheetstyle.com/ReferenceBrowser/index.htm
Running macro:
Public Sub xenu_mac()
'Convert output from xenu pasted to a worksheet.
'David McRitchie   --  
'   http://www.mvps.org/dmcritchie/excel/fix404.htm
'   2003-07-15 
Dim nCol As Long, nRow As Long, cRow As Long, lastrow As Long
Dim wsSource As Worksheet, wsNew As Worksheet
Dim lastcell As Range
nCol = 0
nRow = 1

Dim I As Long
Set lastcell = Cells.SpecialCells(xlLastCell)
lastrow = lastcell.Row + 1  'adjustment to help with insureCol
Dim xStr As String
Set wsSource = ActiveSheet
Set wsNew = Worksheets.Add
Application.ScreenUpdating = False
Application.DisplayAlerts = False

 Cells.NumberFormat = "@"
For cRow = 1 To lastrow
   If InStr(1, xStr, "broken link(s)", 1) Then GoTo done
   xStr = wsSource.Cells(cRow, 1).Value
   If xStr = "" Then
     nRow = nRow + 1
     nCol = 0
   Else
     If Left(xStr, 10) = "          " Then
        nCol = 2
        If InStr(1, xStr, "403 (forbidden", 1) Then
           Cells(nRow, 1).EntireRow.Clear
           nRow = nRow - 1
           nCol = 0
           GoTo nextcrow
        End If
     ElseIf Left(xStr, 8) = "        " Then
        nCol = 1
        nRow = nRow + 1
     Else
        nRow = nRow + 1
        nCol = 0
        If Left(xStr, 5) = "http:" Then GoTo done
     End If
     xStr = Trim(xStr)
    ' If LCase(Left(xStr, 5)) = "http:" Then
    '    nCol = 0
    ' End If
    ' If InStr(1, xStr, "broken link", 1) Then
    '    nCol = 0
    ' End If
    ' If InStr(1, xStr, "======", 1) Then
    '    nCol = 0
    ' End If
     nCol = nCol + 1
     wsNew.Cells(nRow, nCol).Value = xStr
   End If
nextcrow:
Next cRow

done:
    Cells.Replace What:="\_____ error code: ", Replacement:="", LookAt _
        :=xlPart, SearchOrder:=xlByRows, MatchCase:=False
    Cells.Replace What:="file:///c:\mywebsite\dmcritchie\excel\", _
      Replacement:="", LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:= _
      False
    Columns("B:B").Select
    Selection.Replace What:=", linked from page(s):", Replacement:="", LookAt _
        :=xlPart, SearchOrder:=xlByRows, MatchCase:=False
    Selection.Replace What:="error code: ", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False
    Columns("C:C").Select

    With ActiveSheet.PageSetup
        .Orientation = xlLandscape
        .FitToPagesWide = 1
        .LeftMargin = Application.InchesToPoints(0.5)
        .RightMargin = Application.InchesToPoints(0.5)
        .TopMargin = Application.InchesToPoints(0.25)
        .BottomMargin = Application.InchesToPoints(0.25)
        .HeaderMargin = Application.InchesToPoints(0.2)
        .FooterMargin = Application.InchesToPoints(0.65)
        .PrintGridlines = True
        .PaperSize = xlPaperLetter
        .FirstPageNumber = xlAutomatic
        .Order = xlDownThenOver
        .BlackAndWhite = False
        .Zoom = 100
    End With
'other format  is 84,   22,   13,  5,  5,  5, 5
    Columns("A:A").ColumnWidth = 9
    Columns("B:B").ColumnWidth = 91
    Columns("C:C").ColumnWidth = 20
    Columns("D:G").ColumnWidth = 5
    ActiveWindow.SelectedSheets.PrintPreview
Application.ScreenUpdating = True     'place at end when debugged
Application.DisplayAlerts = True
End Sub

Table produced

 ABCDE
1      
2 Broken links, ordered by page:   
3      
4excel.htm    [P]
5   http://triton.dsu.edu/uswest97/msoffice/excel.htm 12002 (timeout)[A]
6   http://www.psych.ucsb.edu/~taap/resources/excel.html 404 (not found)[A]
7  Fixed http://spreadsheetstyle.com/links/links.htm 12007 (no such host)[A]
8  Fixed http://spreadsheetstyle.com/style/10tips.htm 12007 (no such host)[A]
9  FixedHiding 2 (file not found)[A]
10  Fixed http://www.payneconsulting.com/Office/Excel/pivotarticle/pivot.htm 404 (not found)[A]
11   %3Ca%20href=3 (invalid path)[A]
12   http://www.free-ed.net/fr03/lfc/030202/120/500 (server error) [A]
13   http://www1.monumental.com/zoogie/ucod/ExtensionFinder/extensio.txt 12002 (timeout)[A]
Inserted Column B afterwards for Comments

Event macros are shown below is used by double-clicking on one of the columns.  If the links are left intact in Column C, then the following Worksheet Function could be used in place of the macro.  Your choice, but the macro is easier to update, and was written as an example for my event.htm for the Shell example of invoking Internet Explorer, other methods are listed there.

Sample formula for cell E8:
=IF(C8<>"",HYPERLINK("http://web.archive.org/web/*/"&C8,"[A]"),
 IF(A8<>"",HYPERLINK("c:\mywebsite\dmcritchie\excel\"&A8,"[P]"),  ""))

Event Macro to work with the above Table

Option Explicit
Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
   'David McRitchie, 2003-07-15, programming
   ' ColA: local HTML,  ColB: notes, ColC: bad link, ColD: link error
   Dim filename As String, IEpath As String
   IEpath = "C:\program files\internet explorer\iexplore.exe"
   If ActiveCell.Column = 1 Then
     If Right(LCase(ActiveCell.Value), 4) = ".htm" Then
       filename = "c:\mywebsite\dmcritchie\excel\" _
           & Trim(ActiveCell.Value)
       Cancel = True    '--no need or further need to edit cell
       Shell IEpath & " " & filename, vbNormalFocus
     End If
   ElseIf ActiveCell.Column = 2 Or ActiveCell.Column = 4 Then
       filename = "http://www.archive.org/web/*/" _
           & Trim(Cells(ActiveCell.Row, 3).Value)
       Cancel = True    '--no need or further need to edit cell
       Shell IEpath & " " & filename, vbNormalFocus
   ElseIf ActiveCell.Column = 3 Then
       filename = Trim(Cells(ActiveCell.Row, 3).Value)
       Cancel = True    '--no need or further need to edit cell
       Shell IEpath & " " & filename, vbNormalFocus
   Else
     Cancel = True
     Exit Sub
   End If
End Sub
Use of   Cancel = True   was suggested by Dave Peterson, programming, 2003-07-15.

XENU options used

Xenu's Link Sleuthing Options
   Parallel Threads         Apply to 
                            all jobs
     1


              Current:         [ OK ]
                30

   100                        [Cancel]


   Maximum level:   1

    X Ask for password
      when needed

      Treat redirections as errors

   Report
    X  Broken links, ordered by links
    X  Broken links, ordered by page
    X  Broken local links
    X  Redirected URLs

       Ftp and gopher URLs
       Valid text URLs

       Site Map
       Statistics
       Local orphans files
Have to look up what is meant by parallel threads.  There is apreadsheet option somewhere have to find it and make sure it shows file, url, url-error, url-title

Related


This page was introduced on July 16, 2003. 

[My Excel Pages -- home]    [INDEX to my site and the off-site pages I reference] 
[Site Search -- Excel]     [Go Back]    [Return to TOP

Please send your comments concerning this web page to: David McRitchie send email comments


Copyright © 1997 - 2004,  F. David McRitchie,  All Rights Reserved