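The simplest way to split a Word document is to open a copy of it, delete the content you do not need, and save what remains as a new document. The steps are easy, but deleting one section after another by hand is tedious. With Spire.Doc, you can split a Word document programmatically, either by section break or by page break, and avoid the manual work. This article walks through both approaches.
Splitting a document by section break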
C#
// Load the test document
Document document = new Document();
document.LoadFromFile("测试文档.docx");
// Split: save each section as its own document
Document newWord;
for (int i = 0; i < document.Sections.Count; i++)
{
    newWord = new Document();
    newWord.Sections.Add(document.Sections[i].Clone());
    newWord.SaveToFile(string.Format("拆分结果/分节符拆分的结果文档_{0}.docx", i));
}
VB.NET
' Load the test document
Dim document As Document = New Document
document.LoadFromFile("测试文档.docx")
' Split: save each section as its own document
Dim newWord As Document
Dim i As Integer = 0
Do While (i < document.Sections.Count)
    newWord = New Document
    newWord.Sections.Add(document.Sections(i).Clone)
    newWord.SaveToFile(String.Format("拆分结果\分节符拆分的结果文档_{0}.docx", i))
    i = (i + 1)
Loop
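The original document is shown in the figure; it contains two section breaks.
Running the program produces the result shown in the figure.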
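The section-based split can also be wrapped in a reusable method. The following is a minimal sketch, not part of Spire.Doc itself: the class name `SectionSplitter`, the method name `SplitBySection`, and the `section_{0}.docx` file name pattern are hypothetical choices made for illustration. It uses only the Spire.Doc calls already shown above, plus `Directory.CreateDirectory` from `System.IO` so that the output folder is known to exist before saving.

```csharp
using System.IO;
using Spire.Doc;

public static class SectionSplitter
{
    // Hypothetical helper (not a Spire.Doc API): writes one file per section
    // of the input document and returns the number of files written.
    public static int SplitBySection(string inputPath, string outputDir)
    {
        // Create the output folder up front; this is a no-op if it already exists.
        Directory.CreateDirectory(outputDir);

        Document document = new Document();
        document.LoadFromFile(inputPath);

        for (int i = 0; i < document.Sections.Count; i++)
        {
            // Copy section i into a fresh document and save it.
            Document part = new Document();
            part.Sections.Add(document.Sections[i].Clone());
            // "section_{0}.docx" is an assumed naming pattern.
            string fileName = string.Format("section_{0}.docx", i);
            part.SaveToFile(Path.Combine(outputDir, fileName));
        }

        return document.Sections.Count;
    }
}
```

A call such as `SectionSplitter.SplitBySection("测试文档.docx", "拆分结果")` would then replace the inline loop.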
Splitting a document by page break
C#
// Load the source document
Document original = new Document();
original.LoadFromFile("C:/Users/Administrator/Desktop/template.docx");
// Create a new document and add a section to it
Document newWord = new Document();
Section section = newWord.AddSection();
int index = 0;
// Walk every section of the source document, look for page breaks,
// and start a new output document at each one
foreach (Section sec in original.Sections)
{
    foreach (DocumentObject obj in sec.Body.ChildObjects)
    {
        if (obj is Paragraph)
        {
            Paragraph para = obj as Paragraph;
            section.Body.ChildObjects.Add(para.Clone());
            // Note: the index i below comes from the original paragraph, so this
            // logic assumes at most one page break per paragraph
            foreach (DocumentObject parobj in para.ChildObjects)
            {
                if (parobj is Break && (parobj as Break).BreakType == BreakType.PageBreak)
                {
                    int i = para.ChildObjects.IndexOf(parobj);
                    // In the copied paragraph, remove the break and everything after it
                    for (int j = i; j < para.ChildObjects.Count; j++)
                    {
                        section.Body.LastParagraph.ChildObjects.RemoveAt(i);
                    }
                    // Save the current part and start the next one
                    newWord.SaveToFile(String.Format("C:/Users/Administrator/Desktop/拆分结果/分页符拆分的结果文档-{0}.docx", index), FileFormat.Docx);
                    index++;
                    newWord = new Document();
                    section = newWord.AddSection();
                    // Copy the paragraph again, then remove the break and everything before it
                    section.Body.ChildObjects.Add(para.Clone());
                    while (i >= 0)
                    {
                        section.Paragraphs[0].ChildObjects.RemoveAt(i);
                        i--;
                    }
                    // Drop the paragraph if nothing followed the break
                    if (section.Paragraphs[0].ChildObjects.Count == 0)
                    {
                        section.Body.ChildObjects.RemoveAt(0);
                    }
                }
            }
        }
        if (obj is Table)
        {
            section.Body.ChildObjects.Add(obj.Clone());
        }
    }
}
// Save the last part
newWord.SaveToFile(String.Format("C:/Users/Administrator/Desktop/拆分结果/分页符拆分的结果文档-{0}.docx", index), FileFormat.Docx);
VB.NET
' Load the source document
Dim original As Document = New Document
original.LoadFromFile("C:\Users\Administrator\Desktop\template.docx")
' Create a new document and add a section to it
Dim newWord As Document = New Document
Dim section As Section = newWord.AddSection
Dim index As Integer = 0
' Walk every section of the source document and split at each page break
For Each sec As Section In original.Sections
    For Each obj As DocumentObject In sec.Body.ChildObjects
        If (TypeOf obj Is Paragraph) Then
            Dim para As Paragraph = CType(obj, Paragraph)
            section.Body.ChildObjects.Add(para.Clone)
            For Each parobj As DocumentObject In para.ChildObjects
                If TypeOf parobj Is Break AndAlso CType(parobj, Break).BreakType = BreakType.PageBreak Then
                    Dim i As Integer = para.ChildObjects.IndexOf(parobj)
                    ' In the copied paragraph, remove the break and everything after it
                    For j As Integer = i To para.ChildObjects.Count - 1
                        section.Body.LastParagraph.ChildObjects.RemoveAt(i)
                    Next
                    ' Save the current part and start the next one
                    newWord.SaveToFile(String.Format("C:\Users\Administrator\Desktop\拆分结果\分页符拆分的结果文档_{0}.docx", index), FileFormat.Docx)
                    index = (index + 1)
                    newWord = New Document
                    section = newWord.AddSection
                    ' Copy the paragraph again, then remove the break and everything before it
                    section.Body.ChildObjects.Add(para.Clone)
                    While (i >= 0)
                        section.Paragraphs(0).ChildObjects.RemoveAt(i)
                        i = (i - 1)
                    End While
                    ' Drop the paragraph if nothing followed the break
                    If (section.Paragraphs(0).ChildObjects.Count = 0) Then
                        section.Body.ChildObjects.RemoveAt(0)
                    End If
                End If
            Next
        End If
        If (TypeOf obj Is Table) Then
            section.Body.ChildObjects.Add(obj.Clone)
        End If
    Next
Next
' Save the last part
newWord.SaveToFile(String.Format("C:\Users\Administrator\Desktop\拆分结果\分页符拆分的结果文档_{0}.docx", index), FileFormat.Docx)
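The original document is shown in the figure; it contains one page break.
Running the program produces the result shown in the figure.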
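The index handling around the page break is the least obvious part of the code above, so here is a simplified model of it. This sketch does not use Spire.Doc at all: a paragraph's children are stood in for by a list of strings, and the `PageBreakModel` class, the `SplitAtBreak` method, and the `<page-break>` marker are all hypothetical names introduced for illustration. It shows why the first loop keeps removing at the same index `i`, while the second loop counts down from `i` to 0.

```csharp
using System;
using System.Collections.Generic;

public static class PageBreakModel
{
    // Stand-in marker for a page break; not a Spire.Doc type.
    public const string Break = "<page-break>";

    // Splits one paragraph's children at the first break marker and returns
    // the children before the break and the children after it.
    public static (List<string> Before, List<string> After) SplitAtBreak(List<string> children)
    {
        int i = children.IndexOf(Break);
        var before = new List<string>(children);
        var after = new List<string>(children);

        // No break: the whole paragraph stays in the current part.
        if (i < 0)
        {
            after.Clear();
            return (before, after);
        }

        // Remove the break and everything after it. Each removal shifts the
        // remaining items left, so the index to remove stays at i.
        for (int j = i; j < children.Count; j++)
        {
            before.RemoveAt(i);
        }

        // Remove the break and everything before it, counting down from i to 0.
        for (int k = i; k >= 0; k--)
        {
            after.RemoveAt(k);
        }

        return (before, after);
    }
}
```

In the real code, `before` corresponds to the last paragraph of the document being saved, and `after` to the first paragraph of the next output document.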