C#/VB.NET: Split a Word Document by Section Breaks and Page Breaks


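The simplest way to split a Word document is to open a copy of it, delete the content you do not need, and save what remains as a new document. The steps are easy, but deleting content section by section is tedious and repetitive. With Spire.Doc, you can split a Word document programmatically by section breaks or by page breaks and avoid the manual work. This article walks through both approaches.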

Split a document by section breaks

C#

using Spire.Doc;

// Load the source document
Document document = new Document();
document.LoadFromFile("测试文档.docx");

// Save each section as its own document
Document newWord;

for (int i = 0; i < document.Sections.Count; i++)
{
    newWord = new Document();
    newWord.Sections.Add(document.Sections[i].Clone());
    newWord.SaveToFile(string.Format("拆分结果/分节符拆分的结果文档_{0}.docx", i));
}

VB.NET

Imports Spire.Doc

' Load the source document
Dim document As Document = New Document
document.LoadFromFile("测试文档.docx")
' Save each section as its own document
Dim newWord As Document
Dim i As Integer = 0
Do While (i < document.Sections.Count)
    newWord = New Document
    newWord.Sections.Add(document.Sections(i).Clone)
    newWord.SaveToFile(String.Format("拆分结果\分节符拆分的结果文档_{0}.docx", i))
    i = (i + 1)
Loop

The source document contains two section breaks:

[Figure: source document with two section breaks]

Running the program produces the following result:

[Figure: result documents after splitting by section breaks]

Split a document by page breaks

C#

using Spire.Doc;
using Spire.Doc.Documents;

// Load the source document
Document original = new Document();
original.LoadFromFile("C:/Users/Administrator/Desktop/template.docx");

// Create a new document and add a section to it
Document newWord = new Document();
Section section = newWord.AddSection();

int index = 0;
// Walk every section of the source document, detect page breaks, and split on them.
// Note: the index arithmetic below assumes a paragraph holds at most one page break.
foreach (Section sec in original.Sections)
{
    foreach (DocumentObject obj in sec.Body.ChildObjects)
    {
        if (obj is Paragraph)
        {
            Paragraph para = obj as Paragraph;
            section.Body.ChildObjects.Add(para.Clone());

            foreach (DocumentObject parobj in para.ChildObjects)
            {
                if (parobj is Break && (parobj as Break).BreakType == BreakType.PageBreak)
                {
                    // In the copied paragraph, drop the page break and everything after it
                    int i = para.ChildObjects.IndexOf(parobj);
                    for (int j = i; j < para.ChildObjects.Count; j++)
                    {
                        section.Body.LastParagraph.ChildObjects.RemoveAt(i);
                    }
                    // Save the content collected up to the page break
                    newWord.SaveToFile(String.Format("C:/Users/Administrator/Desktop/拆分结果/分页符拆分的结果文档-{0}.docx", index), FileFormat.Docx);

                    // Start the next document with the part of the paragraph after the break
                    index++;
                    newWord = new Document();
                    section = newWord.AddSection();
                    section.Body.ChildObjects.Add(para.Clone());
                    while (i >= 0)
                    {
                        section.Paragraphs[0].ChildObjects.RemoveAt(i);
                        i--;
                    }

                    // Discard the paragraph if nothing followed the page break
                    if (section.Paragraphs[0].ChildObjects.Count == 0)
                    {
                        section.Body.ChildObjects.RemoveAt(0);
                    }
                }
            }
        }
        if (obj is Table)
        {
            section.Body.ChildObjects.Add(obj.Clone());
        }
    }
}
// Save whatever remains after the last page break
newWord.SaveToFile(String.Format("C:/Users/Administrator/Desktop/拆分结果/分页符拆分的结果文档-{0}.docx", index), FileFormat.Docx);

VB.NET

Imports Spire.Doc
Imports Spire.Doc.Documents

' Load the source document
Dim original As Document = New Document
original.LoadFromFile("C:\Users\Administrator\Desktop\template.docx")
' Create a new document and add a section to it
Dim newWord As Document = New Document
Dim section As Section = newWord.AddSection
Dim index As Integer = 0
' Walk every section of the source document, detect page breaks, and split on them.
' Note: the index arithmetic below assumes a paragraph holds at most one page break.
For Each sec As Section In original.Sections
    For Each obj As DocumentObject In sec.Body.ChildObjects
        If (TypeOf obj Is Paragraph) Then
            Dim para As Paragraph = CType(obj, Paragraph)
            section.Body.ChildObjects.Add(para.Clone)
            For Each parobj As DocumentObject In para.ChildObjects
                If TypeOf parobj Is Break AndAlso CType(parobj, Break).BreakType = BreakType.PageBreak Then
                    ' In the copied paragraph, drop the page break and everything after it
                    Dim i As Integer = para.ChildObjects.IndexOf(parobj)
                    For j As Integer = i To para.ChildObjects.Count - 1
                        section.Body.LastParagraph.ChildObjects.RemoveAt(i)
                    Next
                    ' Save the content collected up to the page break
                    newWord.SaveToFile(String.Format("C:\Users\Administrator\Desktop\拆分结果\分页符拆分的结果文档_{0}.docx", index), FileFormat.Docx)
                    ' Start the next document with the part of the paragraph after the break
                    index = (index + 1)
                    newWord = New Document
                    section = newWord.AddSection
                    section.Body.ChildObjects.Add(para.Clone)
                    While (i >= 0)
                        section.Paragraphs(0).ChildObjects.RemoveAt(i)
                        i = (i - 1)
                    End While
                    ' Discard the paragraph if nothing followed the page break
                    If (section.Paragraphs(0).ChildObjects.Count = 0) Then
                        section.Body.ChildObjects.RemoveAt(0)
                    End If
                End If
            Next
        End If
        If (TypeOf obj Is Table) Then
            section.Body.ChildObjects.Add(obj.Clone)
        End If
    Next
Next
' Save whatever remains after the last page break
newWord.SaveToFile(String.Format("C:\Users\Administrator\Desktop\拆分结果\分页符拆分的结果文档_{0}.docx", index), FileFormat.Docx)

The source document contains one page break:

[Figure: source document with one page break]

Running the program produces the following result:

[Figure: result documents after splitting by page breaks]
